Data Analysis – Portfolio Project

June - July 2024

PROJECTS

11/9/2024, 2 min read

The “Coding for Data” course from The Global Career Accelerator introduced me to data analysis using SQL and Python. We completed two projects, both in groups.

First Project

In the first project, we were given three datasets from Intel's Sustainability Team. The goal was to analyze the data with SQL queries and help the team select the best location for a new data center.

The first task was to identify regions that are net energy producers, since some regions purchase power from other regions while others sell their surplus to regions in need. Next, we generated a new table from Intel’s energy data table and visualized “renewable energy” and “fossil fuels” trends in Tableau Cloud. Intel then provided additional data to help us reach a sound conclusion about the location of its next data center. For this task, we combined two datasets containing Intel’s power plant data before performing our analysis.
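As a rough illustration of the first task, the sketch below runs a SQL query through Python's built-in sqlite3 module. The table name `energy_by_region`, its columns, and every figure are hypothetical stand-ins, not Intel's actual data. It assumes a region counts as a net producer when it generates more energy than it consumes.

```python
import sqlite3

# Hypothetical stand-in data: the real Intel tables and columns are not shown in this post.
rows = [
    ("Region A", 120.0, 100.0),
    ("Region B", 80.0, 95.0),
    ("Region C", 150.0, 110.0),
    ("Region A", 130.0, 105.0),
    ("Region B", 85.0, 90.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_by_region (region TEXT, generated_mwh REAL, consumed_mwh REAL)"
)
conn.executemany("INSERT INTO energy_by_region VALUES (?, ?, ?)", rows)

# Sum generation and consumption per region, then keep only regions with a surplus.
query = """
    SELECT region,
           SUM(generated_mwh) - SUM(consumed_mwh) AS net_mwh
    FROM energy_by_region
    GROUP BY region
    HAVING SUM(generated_mwh) - SUM(consumed_mwh) > 0
    ORDER BY net_mwh DESC
"""
net_producers = conn.execute(query).fetchall()
conn.close()

for region, net_mwh in net_producers:
    print(f"{region}: surplus of {net_mwh:.1f} MWh")
```

With this made-up data, Region B drops out of the result because it consumes more than it generates.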

Once the SQL analysis was complete, we moved on to visualizing the data in Tableau Cloud and investigated which regions would best suit Intel’s next data center. We then wrote a short report on our findings and recommended the most suitable location.

Second Project

In the second project, we were given two data frames from the “Recording Academy” and the “Grammy Awards” websites to analyze using Python, with the Pandas and Plotly libraries.

The first task was to load the two data frames and familiarize ourselves with the data using the Pandas library. Next, we used the Plotly library to create a line chart of the number of users on the site for each day of the year. The chart showed more visitors during spring and towards the end of winter.

For our next task, we generated a new data frame from the two original ones using specific metrics and compared it with the provided data from the period when the two websites were still one. The output showed that user engagement was more varied when the websites were combined and became more stable after they were split. This analysis also involved measuring the percentage of users who visit the site but leave without ever interacting with it (commonly called the bounce rate).
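The comparison described above can be sketched roughly as follows. All three frames, the `visitors` and `bounced_visitors` columns, and the numbers are invented for illustration and do not reproduce the project's results. The standard deviation is used here as one simple way to express how varied a metric is.

```python
import pandas as pd

# Hypothetical daily figures; the real data and column names may differ.
combined_site = pd.DataFrame(
    {"visitors": [2000, 1800, 2200, 2100], "bounced_visitors": [1000, 540, 1320, 630]}
)
grammys = pd.DataFrame(
    {"visitors": [1000, 1200, 800, 1500], "bounced_visitors": [400, 540, 320, 600]}
)
academy = pd.DataFrame(
    {"visitors": [500, 520, 480, 510], "bounced_visitors": [150, 156, 144, 153]}
)

# Label each frame, then stack them into one frame for comparison.
all_sites = pd.concat(
    [
        combined_site.assign(source="combined_site"),
        grammys.assign(source="grammys"),
        academy.assign(source="academy"),
    ],
    ignore_index=True,
)

# Bounce rate: share of visitors who left without interacting, as a percentage.
all_sites["bounce_rate_pct"] = (
    all_sites["bounced_visitors"] / all_sites["visitors"] * 100
)

# Mean shows the typical level; standard deviation shows day-to-day variation.
summary = all_sites.groupby("source")["bounce_rate_pct"].agg(["mean", "std"])
print(summary.round(2))
```

A larger `std` for one source than for the others would indicate less stable engagement for that source.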

After these analyses, we shifted our focus to age demographics to see which audiences the websites resonate with most. We found that, on average, visitors between the ages of 18 and 25 made up the largest group on both websites, which suggests that younger audiences are the most interested. Once all these tasks were completed, our recommendation was to keep the websites separate so that users can interact with each one depending on the content they are looking for.
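One way to line up age groups across the two sites is sketched below. The age brackets other than 18 to 25, the `pct_visitors` column, and all values are hypothetical and only show the shape of the comparison.

```python
import pandas as pd

# Hypothetical age-group shares per site; values are illustrative only.
demographics = pd.DataFrame(
    {
        "site": ["grammys"] * 3 + ["academy"] * 3,
        "age_group": ["18-25", "26-40", "41+"] * 2,
        "pct_visitors": [45.0, 35.0, 20.0, 40.0, 38.0, 22.0],
    }
)

# One row per age group and one column per site, for a side-by-side view.
by_age = demographics.pivot(index="age_group", columns="site", values="pct_visitors")
print(by_age)

# Age group with the largest share of visitors on each site.
top_group = by_age.idxmax()
print(top_group)
```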