General 1 — Ecology On Fire

Analyzing Tree Size Structure to Understand
Fire Effects on Population Dynamics

An Introduction to Excel

Synopsis:

Tree diameter measurements reveal the demographic history written into forest stands. They can help us determine whether populations exhibit continuous recruitment or regeneration pulses. In this workshop, you'll learn the ecological theory behind size structure analysis and master Excel techniques to transform DBH data into frequency distributions that expose recruitment bottlenecks and fire effects. Using Caribbean pine data from Belize and your own field measurements, you'll calculate population statistics, create publication-quality graphs, and interpret what different distribution shapes tell us about past disturbance regimes. Students interested in learning these analyses in R programming can download the code and student guide to running R.

In this workshop, you will:

Explore some example data from Belize and perform quality control
Create size class distributions for each treatment
Calculate summary statistics, including mean and standard deviation
Create graphs to compare DBH distributions between treatments
Test whether there are statistical differences between treatments
Interpret results in the context of fire ecology theory
Gain experience using R with the Swirl Package and repeat the Excel activity in R Statistics.

**Caribbean Pine (Pinus caribbea)**

In collaboration with fire managers in Paynes Creek National Park, Belize, researchers examined the size distribution of 237 Caribbean pine trees. Methods involved sampling six 20×50m plots in 2017, with 3 plots in open savanna environments (low fire intensity, sparse pine canopy, abundant herbaceous cover, little litter) and 3 plots in forest environments (high fire intensity, variable pine overstory, primarily shrub groundcover with some bunchgrasses). Column Definitions:

A: EntryID: Sequential numbers
B: Plot: Plot number (identifies different locations/treatments)
C: Veg: Vegetation/treatment type (savanna and forest)
D: Tag: Individual tree identification number
E: DBH2017: Diameter at breast height in 2017 (cm)

Reference: Fill, J.M., M. Muschamp, F. Tricone, R.M. Crandall, and R. Anderson. 2025. Large trees are most influential for long-term persistence of Caribbean pine (Pinus caribaea var. hondurensis) populations in lowland Belize savannas. Biotropica, DOI: 10.1111/btp.70019.

Download Data

Data Import

Materials Needed

Files:

CPineData_RawData_forPH.csv (provided)
Excel_Pine_Analysis.xlsx (you'll create this)
Accompanying PowerPoint Presentation (provided)
See here for additional student resources

Software:

Microsoft Excel 2016+ with Analysis ToolPak enabled
To enable the Analysis ToolPak in Excel, go to Home → Add-ins. Search for “Analysis ToolPak'.“ Select “Add.”

Preparation:

Download CPineData_RawData_forPH.csv to your computer
Create a new folder: Caribbean_Pine_Analysis
Save the CSV file in this folder

Import data:

Double-click on the CSV file: CPineData_RawData_forPH.csv
Save your workbook:
- File → Save As…
- Name it: Caribbean_Pine_Analysis
- Select File Format: Excel Workbook (.xlsx)
- Save in your Caribbean_Pine_Analysis folder
Rename the sheet:
- Right-click on "Sheet1" tab at bottom
- Select "Rename"
- Type: Raw_Data
- Press Enter

MODULE 1: Data Exploration and Quality Control

Part A: Initial Data Exploration

Step 1: Examine the Data

Click on cell A1 (should say "EntryID")
Press Ctrl + Shift + End to jump to the last cell with data
Checkpoint: How many rows are in the dataset?

Step 2: Explore Each Column

Plot Column (Column B):

Click on any cell in column B (e.g., B2)
Insert a PivotTable:
- Click Insert tab
- Click PivotTable
- Select "New Worksheet"
- Click OK
In PivotTable Fields pane (See below):
- Drag "Plot" to "Rows" box
- Drag "Plot" to "Values" box
- Click on “Sum of Plot” and select “Count”
Result - This shows you the number of trees per plot. See figure below.
Checkpoint: How many unique trees? Which plot has the highest number of trees
Rename this sheet: Right-click sheet tab → Rename → Plot_Summary
Go back to Raw_Data sheet (click tab at bottom)

Veg Column (Column C):

Click on cell C1
Click Data tab → Filter
- Small dropdown arrows appear in each of the header rows
Click the dropdown arrow in "Veg" column
Checkpoint: How many unique values do you see? These are your treatment categories!

DBH2017 Column (Column E):

Click on cell E1
Scroll down to look at the DBH values
Checkpoint: What are the minimum and maximum values? Do you see any unusual values (very small, very large, blank)?

Click Data → Filter again to remove filter (show all data)

MODULE 1: Data Exploration and Quality Control

Part B: Quality Control (QC)Checks

Create a new sheet for organized data:

Click on + at bottom of sheet
Rename it: Organized_Data

Copy data to new sheet:

Go back to Raw_Data sheet
Click cell A1
Press Ctrl + A to select all data
Press Ctrl + C (copy)
Go to Organized_Data sheet
Click cell A1
Press Ctrl + V (paste)

QC Check #1: Missing DBH Values

Click on cell F1 in Organized_Data sheet
Type: QC_Flag
Press Enter
Click on cell F2
Copy-paste or type this formula EXACTLY:
=IF(ISBLANK(E2),"Missing",IF(E2<2,"Too Small",IF(E2>80,"Very Large","OK")))
Press Enter
Click back on F2
Double-click the fill handle (small square at bottom-right of cell; see figure below), which will copy formula down to all rows
Click on cell F1 (header)
Click Data → Filter
Click the dropdown in QC_Flag column
Checkpoint: What do you see? Number of "Missing"? Number of "Too Small"? Number of "Very Large"?

QC Check #2: Check Treatment Consistency

Click the dropdown in Veg column. This shows if treatment names are spelled consistently.
Checkpoint: Look at the highlighted cells. Are there variations in spelling or capitalizations (e.g., "Savanna" vs "savanna")? Are there typos? If you find variations, standardize them using Find & Replace
Click Data → Filter again to remove filter (show all data)

QC Check #3: DBH Range by Life Stage

Create a PivotTable: Select all data - Insert → PivotTable → New Worksheet
Set up the PivotTable:
- Drag "Veg" to Rows
- Drag “Plot” to Rows
- Drag "DBH2017" to Values
- Click on "Sum of DBH2017" → Value Field Settings → Average
- Drag "DBH2017" to Values again
- Click on this second one → Value Field Settings → Min
- Drag "DBH2017" to Values a third time
- Click on this one → Value Field Settings → Max
Rename this sheet: Stage_DBH_Check
You should see:

MODULE 2: Size Class Distributions

Part A: Set Up Analysis Parameters

Create a new sheet:

Insert new worksheet
Rename: Parameters

Create parameter table:

In the Parameters sheet, create this table starting at A1, adjusting column widths as necessary. Do not enter anything in []:

2. Here are the formulas to copy-paste:

=COUNT(UNIQUE(Organized_Data!B:B))

=COUNTA(Organized_Data!A:A)-1

MODULE 2: Size Class Distributions

Part B: Create Size Classes

Go back to Organized_Data sheet

Add Size Class Column:

Click on cell G1
Type: Size_Class
Press Enter
Click on cell G2
Type this formula (this is complex - copy carefully!):
=IF(E2<5,"2-5",IF(E2<10,"5-10",IF(E2<15,"10-15",IF(E2<20,"15-20",IF(E2<25,"20-25",IF(E2<30,"25-30",IF(E2<35,"30-35",IF(E2<40,"35-40",IF(E2<45,"40-45",IF(E2<50,"45-50","50+"))))))))))
Press Enter
Click back on G2
Double-click fill handle to copy down.
Checkpoint: Verify it worked: Tree with DBH = 7.5 should show "5-10" Tree with DBH = 23 should show "20-25"Every tree should now have a size class!

MODULE 2: Size Class Distributions

Part C: Calculate Frequency Distributions

Create new sheet:

Insert new worksheet
Rename: Size_Distribution

Step 1: Identify Your Treatments

Go back to Organized_Data sheet
What treatments are in the "Veg" column?

Step 2: Create Frequency Table Structure

In the Size_Distribution sheet:

Row 1 (Headers):

Starting in A1, type these headers (one per column):
A1: Size_Class
B1: Savanna_Count (use YOUR treatment names, e.g., Burned and Unburned instead of Savanna and Forest)
C1: Savanna_%
D1: Forest_Count
E1: Forest_%
(Add more column pairs if you have more treatments)

Column A (Size Classes):

Select entire column and right-click → Select Format Cells → Select Text → Click OK

Starting at A2, type these size classes (one per row):
2-5
5-10
10-15
15-20
20-25
25-30
30-35
35-40
40-45
45-50
50+

Step 3: Count Trees in Each Size Class

CRITICAL: Replace "Savanna" in the formulas below with YOUR actual treatment name from the Veg column!

In cell B2 (count for 2-5 cm, Savanna treatment):

=COUNTIFS(Organized_Data!$C:$C,"Savanna",Organized_Data!$G:$G,"2-5")

Explanation:

Organized_Data!$C:$C = Veg column (treatment)
"Savanna" = name of treatment (CHANGE THIS to your actual treatment name!)
Organized_Data!$G:$G = Size_Class column
"2-5" = size class we're counting

In cell B3 (count for 5-10 cm, Savanna):

=COUNTIFS(Organized_Data!$C:$C,"Savanna",Organized_Data!$G:$G,"5-10")

Continue for ALL size classes (B4 through B12):

Change the last part to match the size class in column A:

B4: 10-15
B5: 15-20
B6: 20-25
B7: 25-30
B8: 30-35
B9: 35-40
B10: 40-45
B11: 45-50
B12: 50+

Repeat for your second treatment (Column D):

In cell D2:

=COUNTIFS(Organized_Data!$C:$C,"Forest",Organized_Data!$G:$G,"2-5")

Continue for D3 through D12, changing "Forest" to YOUR second treatment name.

Step 4: Calculate Percentages

In cell C2:

=B2/SUM($B$2:$B$12)*100

IMPORTANT: The $ signs make the SUM range absolute, so when you copy down, it always sums B2:B12.

Copy this formula down:

Double-click fill handle to copy down

Repeat for your second treatment (Column E):

In cell E2:

=D2/SUM($D$2:$D$12)*100

Double-click fill handle to copy down

Step 5: Add TOTAL row

In A13: Type TOTAL

In B13:

=SUM(B2:B12)

In C13:

=SUM(C2:C12)

In D13:

=SUM(D2:D12)

In E13:

=SUM(E2:E12)

Alternatively, drag the fill handle to the right

Format the TOTAL row:

Select A13:E13
Home → Font → Bold
Home → Font → Borders → Top Border

Checkpoint:

Do your percentages sum to 100%?
Which size class has the most trees in Savanna treatment?
Which size class has the most trees in Forest treatment?

MODULE 3: Summary Statistics and Comparisons

Part A: Calculate Basic Statistics

Create new sheet:

Insert new worksheet
Rename: Summary_Stats

Step 1: Create Statistics Table Structure

Create this table starting at A1:

Step 2: Calculate Statistics for savanna Treatment

IMPORTANT: Change "Savanna" and “Forest” in these formulas to YOUR treatment names!

In B2 (Sample Size):

=COUNTIFS(Organized_Data!$C:$C,"Savanna",Organized_Data!$E:$E,">0")

This counts trees in Savanna treatment that have a DBH > 0.

In B3 (Mean):

=AVERAGEIF(Organized_Data!$C:$C,"Savanna",Organized_Data!$E:$E)

In B4 (Median):

This requires an array formula:

=MEDIAN(IF(Organized_Data!$C:$C="Savanna",Organized_Data!$E:$E))

In B5 (Std Dev):

=STDEV.S(IF(Organized_Data!$C:$C="Savanna",Organized_Data!$E:$E))

Press Ctrl + Shift + Enter

In B6 (Minimum):

For Excel 2019+:

=MINIFS(Organized_Data!$E:$E,Organized_Data!$C:$C,"Savanna")

In B7 (Maximum):

=MAXIFS(Organized_Data!$E:$E,Organized_Data!$C:$C,"Savanna")

In B8 (Range):

=B7-B6

Step 3: Repeat for Forest Treatment (Column C)

Copy all formulas from column B to column C, then use Find & Replace:

Select C2:C8
Press Ctrl + H (Find & Replace)
Find: "Savanna"
Replace: "Forest"
Click "Replace All" (There should be 6 replacements)

Step 4: Calculate Differences (Column D)

In D2:

=C2-B2

Use fill handle to copy down to D3:D8

MODULE 3: Summary Statistics and Comparisons

Part B: Advanced Statistics

Continue in Summary_Stats sheet
Add a couple more rows below your current table, starting at A9 with DISTRIBUTION METRICS:

6. Calculate Coefficient of Variation (CV) and Skewness

In B10:

=(B5/B3)*100

This is: (Std Dev / Mean) × 100

In C10:

=(C5/C3)*100

In D10:

=B10-C10

Interpretation of CV:

Add a column E for Savanna interpretation and a column F for Forest interpretation:

In E10:

=IF(B10<30,"Low (even-aged)",IF(B10<60,"Moderate","High (uneven-aged)"))

Copy to F10 (for Forest interpretation):

=IF(C10<30,"Low (even-aged)",IF(C10<60,"Moderate","High (uneven-aged)"))

Row 11: Skewness

Skewness is trickier in Excel without add-ins.

Simple approximation using Pearson's method:

In B11:

=3*(B3-B4)/B5

This is: 3 × (Mean - Median) / Std Dev

In C11:

=3*(C3-C4)/C5

Interpretation:

Positive skewness (>0.5) = right-skewed, more small trees
Near zero (-0.5 to 0.5) = symmetric
Negative skewness (<-0.5) = left-skewed, more large trees

Add interpretation in E11 for Savanna treatment:

=IF(B11>0.5,"Right-skewed (many small trees)",IF(B11<-0.5,"Left-skewed (many large trees)","Approximately symmetric"))

Add interpretation in E12 for Forest treatment:

=IF(C11>0.5,"Right-skewed (many small trees)",IF(C11<-0.5,"Left-skewed (many large trees)","Approximately symmetric"))

Checkpoint: Let’s discuss your results so far.

MODULE 4: Create Figures

Figure 1: Size Class Distribution Histogram

Go to Size_Distribution sheet
Select your data for the chart:
- Click on cell A1
- Drag to select: A1:A12, C1:C12, E1:E12 (We're selecting Size_Class and the two Percent columns)
- Hold Ctrl and click to select non-adjacent columns
Insert Chart:
- Click Insert tab
- Click Insert 2D Column Chart
- Select Clustered Column (first option)
A chart appears!
Format the chart:

Chart Title:

Click on title
Type: Size Class Distribution by Treatment

Axis Titles:

Chart Design → Add Chart Element → Axis Titles → Primary Horizontal
Click in text box → Delete “Axis title” → Type: Diameter Class (cm)
Chart Design → Add Chart Element → Axis Titles → Primary Vertical
Click in text box → Delete “Axis title” → Type: Frequency (%)
Go to Home and change font style and size

Remove gaps between bars (make it look like a histogram):

Right-click on any bar
Select Format Data Series
Series Options (bar chart icon)
Gap Width: Move slider to 0%

Color the bars:

Right-click on Savanna bars → Format Data Series
Fill & Border (paint bucket icon)
Fill → Solid fill
Color: Choose a color that represents savannas
Right-click on Forest bars
Choose a color that represents forests

Move legend to top:

Click legend
Chart Design → Add Chart Element → Legend → Top

6. Move chart to its own sheet:

Right-click on chart
Move Chart
Select New sheet
Name: Fig1_SizeDistribution
Click OK

Checkpoint:

Does either treatment show a reverse J-shape?
Which treatment shows a more normal distribution?
What does this tell you about recruitment?

Figure 2: Side-by-Side Comparison of Means

Go to Summary_Stats sheet

Select cells A3:C3 (Mean DBH row with treatments)
Insert → Column Chart → Clustered Column
Format the chart:
- Title: Mean DBH by Treatment
- Y-axis: Mean DBH (cm)
- If there is a legend, remove it (only 2 bars, labels on X-axis are clear)
Add labels to bars
- Right-click chart → Select Data
- Horizontal Axis Labels → Edit
- Select B1:C1 (Savanna, Forest)
- Click OK
Add Error Bars (Standard Error):
- Click on one bar
- Chart Design → Add Chart Element → Error Bars → More Error Bar Options
- Error Bar Options:
  - Direction: Both
  - End Style: Cap
  - Error Amount: Custom → Specify Value
- Positive Error Value: Select Std Dev for both treatments
- Negative Error Value: Same values
Play around with bar colors, fonts, and font sizes
Make bars wider:
- Right-click bar → Format Data Series
- Change values for Gap Width

MODULE 5: Statistical Tests

Part A: Descriptive Comparison

Create new sheet:

Insert a new worksheet
Rename: Statistical_Tests
Create a summary comparison:

MODULE 5: Statistical Tests

Part B: F-Test for Equal Variances

This tests if the variability (spread) is different between treatments.

1. In Statistical_Tests sheet, in cell A10, type F-TEST FOR EQUAL VARIANCES

2. Click Data tab → Data Analysis

4. In Data Analysis dialog:

Select F-Test Two-Sample for Variances
Click OK

5. F-Test dialog box:

Variable 1 Range:

Click in the box
Go to Organized_Data sheet
Select the DBH values for savanna: E[first row]:E[last row]
This enters: `Organized_Data!E$137:E$223`
Variable 2 Range:

Click in the box
Go to Organized_Data sheet

Select the DBH values for forest: E[first row]:E[last row]

Labels: Leave unchecked (we didn't select headers)

Alpha: 0.05

Output Range:

Click back on Statistical_Tests sheet
Click cell A12
Click OK

Excel outputs:

F-Test Two-Sample for Variances with various statistics including:

Mean for each variable
Variance for each variable
Observations (number of)
df (degrees of freedom)
F statistic (KEY VALUE)
P(F<=f) one-tail (KEY VALUE)
F Critical one-tail

INTERPRETATION:

In cell C12 (next to output), type: INTERPRETATION
In C13, type:

=IF(B20<0.05,"Variances are SIGNIFICANTLY DIFFERENT (P<0.05)","Variances are EQUAL (P>0.05)")

What it means:

If P < 0.05: Use Welch's t-test (accounts for unequal variances)
If P ≥ 0.05: Use Student's t-test (assumes equal variances)

Checkpoint:

What is the F statistic?
What is the P-value?
Are the variances equal or different?

MODULE 5: Statistical Tests

Part C: T-Test for Mean Difference

This tests if mean DBH is significantly different between treatments.

1. In Statistical_Tests sheet, go to cell A25:

Type: T-TEST FOR MEAN DBH

2. Click Data → Data Analysis → t-Test

Which t-test to choose?

If F-test P < 0.05: t-Test: Two-Sample Assuming Unequal Variances
If F-test P ≥ 0.05: t-Test: Two-Sample Assuming Equal Variances

3. Click OK

4. T-Test dialog:

Variable 1 Range:

Go to Organized_Data sheet
Select DBH values for savanna data

Variable 2 Range:

Go to Organized_Data sheet
Select DBH values for forest data

Hypothesized Mean Difference: 0

Labels: Unchecked

Alpha: 0.05

Output Range: Cell A27 in Statistical_Tests sheet

5. Click OK

Excel outputs:

t-Test results including:

Mean for each variable
Variance for each variable
Observations
df (degrees of freedom) - KEY
t Stat (KEY VALUE)
P(T<=t) one-tail
t Critical one-tail
P(T<=t) two-tail (KEY VALUE - use this one!)
t Critical two-tail

INTERPRETATION:

Add interpretation next to output:

In column C next to the t-test output:
Label cell (e.g., E27): INTERPRETATION
Next cell (e.g., E28):

=IF(B38<0.001,"Means are HIGHLY SIGNIFICANTLY DIFFERENT (P<0.001)",IF(B38<0.01,"Means are SIGNIFICANTLY DIFFERENT (P<0.01)",IF(B38<0.05,"Means are SIGNIFICANTLY DIFFERENT (P<0.05)","Means are NOT significantly different (P>0.05)")))

2. Write your statistical statement:

In cell A40:

="Mean DBH was "&IF(B38<0.05,"significantly ","not significantly ")&"different between treatments (Savanna: "&TEXT(Summary_Stats!B3,"0.0")&" cm, Forest: "&TEXT(Summary_Stats!C3,"0.0")&" cm; t="&TEXT(B35,"0.00")&", df="&TEXT(B34,"0")&", P="&TEXT(B38,"0.000")&")."

This automatically generates a results sentence!

Checkpoint:

What is the t-statistic?
What are the degrees of freedom (i.e., df)?
What is the P-value (two-tailed)?
Are the means significantly different?

MODULE 6: Interpretation Questions

Size Structure: Which treatment has higher CV? What does this indicate about age structure?
Distribution Shape: Which treatment is right-skewed (positive skewness)? What does this mean ecologically? Is this what you'd expect based on the lecture?
Statistical Significance: Are the means significantly different? Is this difference large enough to be ecologically important?
Fire Trap Evidence: Do you see a "missing middle" in any treatment? Which size classes are under-represented? What does this suggest about fire effects?
Discussion questions
- Did your results match the predictions from the lecture? Why or why not?
- What ecological processes explain the patterns you observed?
- What are the management implications of these findings?

MODULE 7: An introduction to R Programming

Swirl is an R package designed to teach R programming and data science interactively within the R console. It provides a series of courses and lessons that guide users through various R concepts and functionalities, and offers a hands-on and interactive way for beginners to learn R programming by practicing directly within the R environment

Using Swirl for R beginners:

Install R and RStudio: Ensure you have R (version 3.1.0 or later) and RStudio (recommended for a better experience) installed on your computer.
Install the Swirl package: Open RStudio and type the following command in the console:
install.packages("swirl")

Load the Swirl package: Load the package into your R session:
library(swirl)

Start Swirl: Initiate the Swirl environment:
swirl()
You will be prompted to enter your name, which allows Swirl to track your progress.

Install a course: The first time you run Swirl, you'll be prompted to install a course. You can choose from recommended courses or explore the Swirl Course Network for more options. For example, to install the "R Programming" course
install_course("R Programming")

Note: Course names are case-sensitive.

Select a lesson:
After installing a course, you can choose a specific lesson to begin. Swirl will present a menu of available lessons. It is recommended that you start with “1: R Programming: The basics of programming in R.”
Engage with the interactive lessons:
Swirl will guide you through the lesson with text explanations, questions, and prompts to enter R code directly into the console.
- Press Enter when you see ... to continue reading.
- Type your responses or R code when you see the R prompt (>) or a specific prompt from Swirl.
- Swirl provides immediate feedback on your code and answers.

Helpful commands within Swirl:

bye(): Exit Swirl and save your progress.
skip(): Skip the current question.
play(): Temporarily exit Swirl to experiment with R in the console.
next(): Regain Swirl's attention after using play().
main(): Return to the main Swirl menu.
info(): Display options and instructions again.

Once you are comfortable with R, you can try replicating the Excel activity from above using this R code and the associated student guide to running R code.

Analyzing Tree Size Structure to Understand Fire Effects on Population Dynamics

Synopsis:

In this workshop, you will:

Caribbean Pine (Pinus caribbea)

Data Import

Materials Needed

MODULE 1: Data Exploration and Quality Control

Part A: Initial Data Exploration

MODULE 1: Data Exploration and Quality Control

Part B: Quality Control (QC)Checks

MODULE 2: Size Class Distributions

Part A: Set Up Analysis Parameters

MODULE 2: Size Class Distributions

Part B: Create Size Classes

MODULE 2: Size Class Distributions

Part C: Calculate Frequency Distributions

MODULE 3: Summary Statistics and Comparisons

Part A: Calculate Basic Statistics

MODULE 3: Summary Statistics and Comparisons

Part B: Advanced Statistics

MODULE 4: Create Figures

MODULE 5: Statistical Tests

Part A: Descriptive Comparison

MODULE 5: Statistical Tests

Part B: F-Test for Equal Variances

MODULE 5: Statistical Tests

Part C: T-Test for Mean Difference

MODULE 6: Interpretation Questions

MODULE 7: An introduction to R Programming

Analyzing Tree Size Structure to Understand
Fire Effects on Population Dynamics

**Caribbean Pine (Pinus caribbea)**