Assignment 4

All the steps are outlined here, but here’s a link to the script that will run everything at once.

In which year did I observe the most individual birds? How many?

yearly.df = df %>%
  #mutate(year = as.character(year)) %>% 
  group_by(year) %>% 
  summarise(yearly_total = sum(count), .groups="drop") %>%
  mutate(year = as.numeric(year)) %>% 
  arrange(desc(yearly_total)) 

# max_bird_year = yearly.df$year[1] # if you arrange 

max_bird_year = yearly.df$year[which(yearly.df$yearly_total == max(yearly.df$yearly_total))]  # if you don't arrange

cat("You observed the most individual birds in", max_bird_year )

## You observed the most individual birds in 2014
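
The question also asks how many birds were seen in that year. A minimal sketch of one way to report both the year and its total, using `slice_max()` on a small made-up data frame (`toy_df` and its values are illustrative assumptions, not the real checklist data):

```r
library(dplyr)

# Hypothetical toy data standing in for the real `df`
toy_df = tibble(
  year  = c(2014, 2014, 2015, 2015, 2016),
  count = c(10, 5, 7, 3, 12)
)

# Total individuals per year, then keep the row(s) with the largest total
top_year = toy_df %>%
  group_by(year) %>%
  summarise(yearly_total = sum(count), .groups = "drop") %>%
  slice_max(yearly_total, n = 1)

cat("You observed the most individual birds in", top_year$year,
    "with", top_year$yearly_total, "birds in total\n")
```

Because `slice_max()` keeps ties by default, `top_year` can hold more than one row if two years share the same maximum.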

In that year how many different species of birds did I observe?

n_species = df %>% 
  filter(year == max_bird_year) %>% 
  pull(scientific_name) %>% 
  n_distinct()  # number of distinct species seen that year

cat("You observed", n_species, "species of bird in", max_bird_year)

## You observed 210 species of bird in 2014

In which state did I most frequently observe Red-winged Blackbirds?

b.bird = df %>% filter( common_name == "Red-winged Blackbird" ) %>% 
  mutate(state = substr(location, 4,5)) %>% 
  ungroup() %>% 
  group_by(state) %>% 
  summarise(Count = sum(count), .groups="drop") %>%
  arrange( desc(Count) ) 

# The data include many other states, but no Red-winged Blackbirds were observed there, so those states drop out at the filter step.

cat("You observed Red-winged Blackbirds most frequently in ", b.bird$state[1], ", with ", b.bird$Count[1], " total birds.", sep = "")

## You observed Red-winged Blackbirds most frequently in MO, with 596 total birds.

Filter observations for a duration between 5 and 200 minutes. Calculate the mean rate per checklist that I encounter species each year. Specifically, calculate the number of species in each checklist divided by duration and then take the mean for the year.

df %>% filter(duration >= 5, duration <= 200) %>% 
  group_by(list_ID, year) %>%
  summarise(num_of_unique = length(common_name), .groups="drop") %>%  # number of species in each checklist 
  group_by(year) %>%
  summarise(num_of_lists = length(unique(list_ID)),
            Mean_species = mean(num_of_unique), .groups="drop") %>%  # mean for each year
  arrange(year)

## # A tibble: 13 x 3
##     year num_of_lists Mean_species
##    <int>        <int>        <dbl>
##  1  2003            2         4.5 
##  2  2004           41         4.32
##  3  2009            1         8   
##  4  2013            1        14   
##  5  2014           77        19.1 
##  6  2015           38        15.9 
##  7  2016            6        19.3 
##  8  2017           55        22.0 
##  9  2018           22        17.7 
## 10  2019           13        20.9 
## 11  2020           40        17.1 
## 12  2021           45        15.1 
## 13  2022           14        13.6
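
The summary above averages the number of species per checklist. The rate described in the question would additionally divide each checklist's species count by its duration before averaging. A minimal sketch of that extra step on made-up data (`toy_df`, `species_per_min`, and `mean_rate` are hypothetical names, and the sketch assumes `duration` is constant within a checklist):

```r
library(dplyr)

# Hypothetical toy data: one row per species per checklist
toy_df = tibble(
  list_ID     = c("A", "A", "A", "B", "B", "C"),
  year        = c(2014, 2014, 2014, 2014, 2014, 2015),
  duration    = c(30, 30, 30, 10, 10, 300),
  common_name = c("Canada Goose", "Mallard", "Blue Jay",
                  "Canada Goose", "Mallard", "Mallard")
)

rate_df = toy_df %>%
  filter(duration >= 5, duration <= 200) %>%           # drops checklist C (300 min)
  group_by(list_ID, year, duration) %>%
  summarise(n_species = n_distinct(common_name), .groups = "drop") %>%
  mutate(species_per_min = n_species / duration) %>%   # rate for each checklist
  group_by(year) %>%
  summarise(num_of_lists = n(),
            mean_rate = mean(species_per_min), .groups = "drop") %>%  # yearly mean of the rates
  arrange(year)

print(rate_df)
```

Here checklist A contributes 3 species in 30 minutes and checklist B contributes 2 species in 10 minutes, so the 2014 value is the mean of those two per-checklist rates rather than the mean species count.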

Create a tibble that includes the complete observations for the top 10 most frequently observed species. First generate a top 10 list and then use this list to filter all observations. Export this tibble as a .csv file saved to a folder called “Results” within your R project and add a link to the markdown document.

# generate a list of top ten most observed (i.e. highest count)
tops_df = df %>% group_by(common_name) %>% 
  summarise(num_observed = sum(count), .groups="drop") %>%  # summarise across all lists/states/etc. 
  arrange( desc(num_observed) )

topten_list = tops_df$common_name[1:10]

topten_df = df %>% filter(common_name %in% topten_list)
knitr::kable(head(topten_df), "simple")

X	list_ID	common_name	scientific_name	date	time	count	duration	location	latitude	longitude	count_tot	month	year
23	S21177034	Canada Goose	Branta canadensis	2015-01-03	02:00 PM	4	90	US-IL	39.92433	-91.41487	11	1	2015
24	S37097000	Canada Goose	Branta canadensis	2017-05-23	02:06 PM	1	50	US-VT	44.09702	-73.34205	44	5	2017
25	S1740462	Canada Goose	Branta canadensis	2004-10-05	05:00 PM	2	0	US-VT	43.78300	-73.31543	12	10	2004
26	S22598375	Canada Goose	Branta canadensis	2015-03-30	04:30 PM	65	30	US-VT	43.78300	-73.31543	81	3	2015
27	S22635608	Canada Goose	Branta canadensis	2015-04-01	04:30 PM	2	20	US-VT	43.78300	-73.31543	16	4	2015
28	S37075088	Canada Goose	Branta canadensis	2017-05-22	06:17 PM	1	9	US-VT	43.78300	-73.31543	61	5	2017

# write.csv(topten_df, "Results/top_ten_species.csv")
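
The export call above is commented out, and `write.csv()` fails if the target folder does not exist. A minimal sketch of the export step under those assumptions, using a small stand-in data frame (`toy_topten` and `results_dir` are hypothetical names, not part of the original script):

```r
# Hypothetical stand-in for topten_df
toy_topten = data.frame(
  common_name = c("Canada Goose", "Mallard"),
  count       = c(4, 2)
)

results_dir = "Results"
# Create the folder if it does not exist yet; stays silent if it already does
dir.create(results_dir, showWarnings = FALSE)

out_path = file.path(results_dir, "top_ten_species.csv")
# row.names = FALSE avoids writing an extra unnamed index column
write.csv(toy_topten, out_path, row.names = FALSE)
```

The `X` column in the table above may come from such an index column: `read.csv()` names an unnamed first column `X`, so passing `row.names = FALSE` when writing keeps it from reappearing on the next read.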

Emily Adamic

9/29/2022