Data Visualization Analysis

Showing posts from February, 2026

Visualizing Distribution in RStudio

February 27, 2026

For this week's assignment, we were tasked with visualizing a distribution in RStudio. I chose the built-in dataset mtcars, which records several attributes for 32 cars. I created a histogram with ggplot2 to show the distribution of one of the most important factors when shopping for a new car: miles per gallon (mpg). The histogram shows that most vehicles fall between 15 and 25 mpg, with a few cars achieving more than 30 mpg. While designing this visualization, I followed Few's recommendations by keeping the layout simple, using consistent coloring, and labeling the axes clearly. I agree that distributions are sometimes overcomplicated, which makes them harder to read. In my case, I tried to keep the chart as simple as possible while still making it visually engaging.
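A histogram like the one described here could be sketched roughly as follows. This is only a minimal example, not the exact code from the assignment: the bin width, the fill color, the title, and the axis label text are all assumptions chosen for illustration.

```r
# Minimal sketch of an mpg histogram for the built-in mtcars dataset.
# Bin width, colors, and label text are illustrative assumptions.
library(ggplot2)

p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(
    binwidth = 2.5,        # assumed bin width; adjust to taste
    boundary = 10,         # align bin edges to multiples of 2.5 starting at 10
    fill = "steelblue",    # one consistent fill color for every bar
    color = "white"        # thin white outline to separate the bars
  ) +
  labs(
    title = "Distribution of Fuel Efficiency in mtcars",
    x = "Miles per gallon (mpg)",
    y = "Number of cars"
  ) +
  theme_minimal()          # plain layout with little non-data ink

print(p)

# Quick numeric checks to read alongside the plot
cars_15_to_25 <- sum(mtcars$mpg >= 15 & mtcars$mpg <= 25)
cars_above_30 <- sum(mtcars$mpg > 30)
```

The two counts at the end are a simple way to compare what the histogram appears to show against the underlying data.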

Basic Visualization in RStudio

February 22, 2026

For this assignment, we made some basic visualizations in RStudio. Here we can first see a pie chart showing counts and colors. For the pie chart and the boxplot, we used the values 40, 30, 20, and 10. This analysis is quite simple, as its main focus is to summarize and compare rather than to make predictions or test a hypothesis. The pie chart fits less well with Few's recommendations, given his typical critique of pie charts, but it still communicates a valid part-to-whole idea, since the values add up to 100. In the pie chart, people might have a hard time comparing the sizes accurately, especially since some values are close to each other. For example, it's hard to see the exact difference between 30 and 20. While pie charts are good at showing proportions, they make precise comparisons difficult. On the other hand, I think the boxplot fits well with the ideas of Few and Yau. The boxplot makes the comparison clear as they all share a common baseline, so it's easier to see the dif...

Using Plotly for Data Visualization

February 13, 2026

I chose Plotly to analyze the given dataset because its ease of use and customization options grabbed my attention. After using it for a bit and playing around with it, I have to say that Plotly is a very intuitive tool to have in the data science field. I used these two charts to compare part-to-whole charts with ranking charts. The ranking chart helped me see closely how each ID compared on average position. With the IDs in ascending order, we can see which ones are performing better relative to the others. This type of visualization works best when comparing data within the same dataset. On the other hand, the part-to-whole chart shows how each ID contributes to the total average position. The main focus of this visualization is to show proportion and distribution. We can see how the same dataset can be interpreted differently depending on which type of visualization we apply to it.

Trends in Ridership and Safety Over Time

February 07, 2026

In this week's assignment, we were tasked with selecting six variables from the given dataset and transforming them into a time visualization. I selected variables that show the system's usage, safety outcomes, and operational activity. I think these variables work well for a time visualization because they change over time. The main patterns this visualization shows are the changes in ridership and operational activity over time. It shows periods of steady growth and periods of decline, and we can see a consistent, steady decline in ridership. We can also see that safety trends do not always directly align with the levels of ridership. All these separate variables come together to form one picture and help us show the change over time in transit demand and safety outcomes. Tableau Visualization Link: https://public.tableau.com/app/profile/jonathan.gonzalez8009/viz/TrendsinRidershipandSafetyOverTime/Sheet1?publish=yes