Read and format project data
= "https://raw.githubusercontent.com/byuidatascience/data4names/master/data-raw/names_year/names_year.csv"
names_url = pd.read_csv(names_url) names
Course DS 250
Nefi Melgar
Names as ways to identify individuals have been used through world’s history. This project involves data exploration, visualization, and interpretation to gain insights into the trends and patterns related to different names over time. Is Brittany a name for old or young ladies? Has Luke Skywalker from Star Wars influeced the names of a generation? Let’s dive in to the results.
HOW DOES YOUR NAME AT YOUR BIRTH YEAR COMPARE TO ITS USE HISTORICALLY?
My name was not listed in the data set. So I took my eldest brother’s name which is Matthew, he was born in 1993. Around 1940, Matthew started to increase in popularity, it took 50 years to reach the highest point with more than 35,000 Matthews.. Once the highest point was reached, the name had a decrease in popularity, before 2000 it shew a spark of popularity, but the dicrease continued..
matthew_data = names.query('name == "Matthew"')
# filter data
person_name = "Matthew"
year_range = "year >= 1910 and year <= 2015"
matthew_data = names.query(f"name == '{person_name}' and {year_range}")
# specific year for his born
born_year = 1993
born_year_data = matthew_data.query(f"year == {born_year}")
specific_year = born_year_data["year"].values
totals_specific_year = born_year_data["Total"].values
# create matthew line chart through the years
mat_chart = px.line(
matthew_data,
x="year",
y="Total",
labels={"year": "Year"},
title="Matthew name through the years",
)
# add mark to highligth the year when he was born
mat_chart.add_trace(
go.Scatter(
x=specific_year,
y=totals_specific_year,
mode="markers",
marker_size=5,
name="1993",
)
)
mat_chart.show()
1993 year is highlighted as the year my brother born..
IF YOU TALKED TO SOMEONE NAMED BRITTANY ON THE PHONE, WHAT IS YOUR GUESS OF HIS OR HER AGE? WHAT AGES WOULD YOU NOT GUESS??
type your results and analysis here
# get data and filter it by name
brittany_data = names.query('name == "Brittany"')
current_year = datetime.datetime.now().year
# create a copy of the data to avoid the SettingWithCopyWarning:
brittany_data_copy = brittany_data.copy()
brittany_data_copy["currentAge"] = current_year - brittany_data_copy["year"]
# create brittany scatter
brittany_chart = px.scatter(
brittany_data_copy,
x="currentAge",
y="Total",
labels={
"currentAge": "Current Age",
},
title="Brittany's ages in the US",
)
brittany_chart.show()
Using 2024 as year of reference, Brittany was a popular name many years ago..
| name | year | Total |
|:---------|-------:|--------:|
| Brittany | 1968 | 5 |
| Brittany | 1969 | 12 |
| Brittany | 1970 | 32 |
| Brittany | 1971 | 81 |
| Brittany | 1972 | 158 |
MARY, MARTHA, PETER, AND PAUL ARE ALL CHRISTIAN NAMES. FROM 1920 - 2000, COMPARE THE NAME USAGE OF EACH OF THE FOUR NAMES. WHAT TRENDS DO YOU NOTICE??
These names had an interesting increase in popularity around 1940’s, but for the 1960’s all of them were in their way to decrase in popularity. Mary shows the highest amount of decrease, in around 20 years it went from 45,000 to 15,000, a loss of 30,000, a considerable amount compared to the other names.
# create copy of the data with specified names and year range
list_names = ["Mary", "Martha", "Peter", "Paul"]
year_names = "year >= 1920 and year <= 2000"
christian_names_data = names.query(f"name in @list_names and {year_names}")
# create line chart to display names between 1920 and 2000
christian_names_chart = px.line(
christian_names_data,
x="year",
y="Total",
color="name",
labels={
"year": "Year",
"name": "Name",
},
title="Christian Names Between 1920-2000",
)
christian_names_chart.show()
“Mary” shows an interesting decrease before 1960.
| name | year | Total |
|:-------|-------:|--------:|
| Martha | 1920 | 8705 |
| Martha | 1921 | 9254 |
| Martha | 1922 | 9018 |
| Martha | 1923 | 8731 |
| Martha | 1924 | 9163 |
THINK OF A UNIQUE NAME FROM A FAMOUS MOVIE. PLOT THE USAGE OF THAT NAME AND SEE HOW CHANGES LINE UP WITH THE MOVIE RELEASE. DOES IT LOOK LIKE THE MOVIE HAD AN EFFECT ON USAGE?
Star Wars is a popular series of movies, streamings, videogames and more. Luke Skywalker was the main character for the movies IV (1977), V (1980) and VI (1983). Although “Luke” name started to grow in popularity around 1970, from 1977 to 1980 it went from 1235 to 3108 a growth of 252% in just 3 years, after that the name’s popularity had it’s ups and downs.
# choose name, and year range for the star wars movies, "luke" as the name
movie_name = "Luke"
year_range = "year >= 1960 and year <= 1990"
movie_name_data = names.query(f"name == '{movie_name}' and {year_range}")
# create line chart for luke name through the years
luke_name_chart = px.line(
movie_name_data,
x="year",
y="Total",
labels={"year": "Year"},
title="Luke Skywalker impact on names of the people",
)
# data to display movies released data, movies iv, v and vi
movie_release_year = [1977, 1980, 1983]
movie_release_data = names.query(
f"year in @movie_release_year and name == '{movie_name}'"
)
totals_movie_year = movie_release_data["Total"].values
# add markers for the years when the moview IV to VI were released
luke_name_chart.add_trace(
go.Scatter(
x=movie_release_year,
y=totals_movie_year,
mode="markers",
marker_size=8,
marker_symbol="star",
marker_color="red",
name="Years when Star Wars movies were released (IV - VI)",
)
)
# change the layout so the legend show at bottom of the plot
luke_name_chart.update_layout(
legend=dict(
orientation="h",
yanchor="bottom",
y=-0.4,
)
)
luke_name_chart.show()
Release year for movies IV, V and VI are highlighted..
ANALYZE DATA ABOUT ELLIOT’S NAME, A POPULAR CHARACTER IN E.T.
E.T. is a beloved series of movies, with releases in 1982, 1985, and 2002. Elliot, the main character in the original 1982 film, became an iconic figure associated with the story. The name “Elliot” saw moderate popularity with 363 instances in 1982 and a slight increase to 398 by 1985. By 2002, the name had grown to 503 instances. However, it wasn’t until 1999 that the name “Elliot” started to gain significant traction, with its popularity rising steadily thereafter. The enduring appeal of the character, coupled with the nostalgia surrounding the E.T. series, likely contributed to this upward trend.
# choose name, and year range for Elliot's name
movie_name = "Elliot"
year_range = "year >= 1950 and year <= 2015"
elliot_data = names.query(f"name == '{movie_name}' and {year_range}")
# create line chart for Elliot's name through the years
elliot_chart = px.line(
elliot_data,
x="year",
y="Total",
color="name",
labels={"year": "Year"},
title="Elliot...What?",
)
# data to display movies released data, 1st , 2nd and 3rd
elliot_selected_years_text = {
1982: "E.T. Released",
1985: "Second Released",
2002: "Third Released",
}
et_movie_release_year = [1982, 1985, 2002]
elliot_selected_data = names.query(
f"year in @et_movie_release_year and name == '{movie_name}'"
)
totals_movie_year = elliot_selected_data["Total"].values
# add vertical lines for the years when the movies were released
vlines_loop_count = 0
selected_years_count = len(elliot_selected_years_text)
annotation_position_list = ["top left", "top right", "top right"]
while vlines_loop_count <= selected_years_count:
i = 0
for year, text in elliot_selected_years_text.items():
# https://plotly.com/python/horizontal-vertical-shapes/
elliot_chart.add_vline(
x=year,
line_width=1,
line_dash="dash",
line_color="red",
annotation_text=text,
annotation_position=f"{annotation_position_list[i]}",
)
vlines_loop_count += 1
i += 1
elliot_chart.show()
Release year for movies 1, 2 and 3 are highlighted..