Stock Data Analysis & Visualizations

July 23, 2024

Introduction

In this blog post, we'll analyze and visualize stock data using Python. We'll explore several techniques for visualizing the dataset, spotting useful patterns, and drawing insights from them.

Data Overview

We will analyze the stock data of several major companies and the S&P 500 index:

  • AAPL: Apple
  • BA: Boeing
  • T: AT&T
  • MGM: MGM Resorts International
  • AMZN: Amazon
  • IBM: IBM
  • TSLA: Tesla Motors
  • GOOG: Google
  • sp500: S&P 500 Index

Data Visualization and Analysis

Importing Necessary Libraries

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from copy import copy
from scipy import stats
import plotly.express as px
import plotly.figure_factory as ff
import plotly.graph_objects as go

Loading and Sorting the Data

stocks_df = pd.read_csv('stock.csv')
stocks_df = stocks_df.sort_values(by=['Date'])
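
One detail worth checking when sorting by Date: if the column is read as plain strings, sort_values orders it lexicographically, which only matches chronological order for ISO-style dates. Here is a minimal sketch, assuming a Date column and using a small made-up in-memory CSV in place of stock.csv (whose exact date format isn't shown in this post):

```python
import io
import pandas as pd

# Hypothetical sample standing in for stock.csv; all values are made up
csv_text = """Date,AAPL,sp500
2012-01-13,60.0,1289.0
2012-01-12,60.2,1295.5
2012-01-17,60.7,1293.7
"""

# parse_dates converts the Date column to datetime64 so sorting is chronological
sample_df = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])
sample_df = sample_df.sort_values(by='Date').reset_index(drop=True)

# Quick sanity checks before any analysis: column types and missing values
print(sample_df.dtypes)
print(sample_df.isnull().sum())
```

Resetting the index after sorting keeps row positions and index labels in step, which matters for any later code that indexes rows by integer.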

Plotting the Raw Stock Prices


# Define a function to plot the entire dataframe
# It takes a dataframe df and a figure title fig_title, and does not return anything
# The function only performs data visualization

def show_plot(df, fig_title):
    df.plot(x='Date', figsize=(15,7), linewidth=3, title=fig_title)
    plt.grid()
    plt.show()

show_plot(stocks_df, 'RAW STOCK PRICES (WITHOUT NORMALIZATION)')
[Figure: Stock Prices Raw]

Insight: Tech giants like Apple, Amazon, Tesla, and Google show robust growth trends, reflecting strong market confidence.

Performing Data Visualization

# Function to perform an interactive data plotting using plotly express
# Plotly.express module which is imported as px includes functions that can plot interactive plots easily and effectively.
# Every Plotly Express function uses graph objects internally and returns a plotly.graph_objects.Figure instance.

def interactive_plot(df, title):
 fig = px.line(title = title)

 # Loop through each stock (while ignoring time columns with index 0)
 for i in df.columns[1:]:
   # Add a new Scatter trace
   fig.add_scatter(x = df['Date'], y = df[i], name = i)

 fig.show()

# Plot interactive chart
interactive_plot(stocks_df, 'Prices')
[Figure: Stock Prices Chart]

Highlight: Significant market events, such as the COVID-19 pandemic, lead to observable volatility in stock prices and indices.

Calculating Individual Stock Daily Returns

# Calculating daily return for a single security
df = stocks_df['sp500']
df_daily_return = df.copy()

# Looping through every element in the series
for j in range(1, len(df)):

  # Calculating the percentage of change from the previous day
  # (.iloc indexes by position, so this stays correct after sorting by Date)
  df_daily_return.iloc[j] = ((df.iloc[j] - df.iloc[j-1]) / df.iloc[j-1]) * 100

# Putting zero in the first line item
df_daily_return.iloc[0] = 0
df_daily_return

# Defining a function to calculate stocks daily returns (for all stocks)
def daily_return(df):
  df_daily_return = df.copy()

  # Loop through each stock by column position (skipping the Date column at position 0)
  for c in range(1, len(df.columns)):

    # Loop through each row belonging to the stock
    for j in range(1, len(df)):

      # Calculate the percentage of change from the previous day
      # (a single .iloc[row, col] assignment avoids chained indexing, which may not write back)
      df_daily_return.iloc[j, c] = ((df.iloc[j, c] - df.iloc[j-1, c]) / df.iloc[j-1, c]) * 100

    # Set the value of the first row to zero since the previous value is not available
    df_daily_return.iloc[0, c] = 0

  return df_daily_return

# Getting the daily returns
stocks_daily_return = daily_return(stocks_df)
stocks_daily_return
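
The row-by-row loops above make the formula explicit, but the same percentage change can also be computed with pandas' built-in pct_change, which is much faster on long histories. Below is a minimal sketch on made-up prices; the function name daily_return_vectorized and the tickers AAA and BBB are hypothetical, and it assumes the first column is Date, as in the loops above:

```python
import pandas as pd

# Made-up prices for illustration only
toy_df = pd.DataFrame({
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03'],
    'AAA': [100.0, 110.0, 99.0],
    'BBB': [50.0, 50.0, 55.0],
})

def daily_return_vectorized(df):
    """Percentage change from the previous row for every non-Date column."""
    returns = df.copy()
    price_cols = df.columns[1:]
    # pct_change computes (p_t - p_{t-1}) / p_{t-1}; the first row has no
    # previous value and comes back as NaN, so it is filled with 0
    returns[price_cols] = df[price_cols].pct_change().fillna(0) * 100
    return returns

toy_returns = daily_return_vectorized(toy_df)
print(toy_returns)
```

In this toy data AAA rises about 10% and then falls about 10%, while BBB is flat and then rises about 10%. Note that fillna(0) would also zero out gaps caused by missing prices, so real data with holes may need separate handling.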

Calculating Correlations Between Daily Returns

# Correlation matrix of the daily returns (dropping the Date column first)
cm = stocks_daily_return.drop(columns = ['Date']).corr()

plt.figure(figsize=(10, 10))
ax = plt.subplot()
sns.heatmap(cm, annot = True, ax = ax);

[Figure: Stock Prices Correlations]

Correlation: High positive correlations among tech stocks suggest that they tend to move together, so holding several of them adds little diversification; the benefit comes from pairing them with less-correlated stocks.
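
To see which pairs stand out in a heatmap like this, the correlation matrix can be flattened into a ranked list of pairs. This is a minimal sketch on synthetic returns; the column names TECH_A, TECH_B, and OTHER are made up for illustration, and the data is random rather than taken from the stock file:

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: two series share a common factor, one is independent
rng = np.random.default_rng(0)
common = rng.normal(0, 1, 500)
toy_returns = pd.DataFrame({
    'TECH_A': common + rng.normal(0, 0.3, 500),
    'TECH_B': common + rng.normal(0, 0.3, 500),
    'OTHER': rng.normal(0, 1, 500),
})

corr = toy_returns.corr()

# Keep only the upper triangle (above the diagonal) so each pair appears once
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values(ascending=False)
print(pairs)
```

Pairs near the top of the list move together and offer little diversification; pairs near zero are the ones that help spread risk.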

Histogram for Daily Returns

# Daily returns are roughly bell-shaped and centered near zero (only approximately normal)
stocks_daily_return.hist(figsize=(10, 10), bins = 40);

[Figure: Daily Returns Histograms]

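
How close the returns are to a normal distribution can be checked numerically rather than by eye. The sketch below uses scipy.stats on synthetic fat-tailed data (drawn from a Student's t distribution, not from the stock file) to show what such a check looks like; the variable names are illustrative:

```python
import numpy as np
from scipy import stats

# Synthetic "returns" with fat tails, standing in for one stock's daily returns
rng = np.random.default_rng(0)
toy_returns = rng.standard_t(df=3, size=5000)

# Skewness is 0 for a symmetric distribution
skewness = stats.skew(toy_returns)

# Excess kurtosis (Fisher definition) is 0 for a normal distribution;
# positive values indicate heavier tails than normal
excess_kurtosis = stats.kurtosis(toy_returns)

# Jarque-Bera tests the null hypothesis that the data is normally distributed
jb_stat, jb_pvalue = stats.jarque_bera(toy_returns)

print(f"skewness={skewness:.3f}, excess kurtosis={excess_kurtosis:.3f}, JB p-value={jb_pvalue:.4f}")
```

A small p-value means normality is rejected. Running the same three calls on a column of stocks_daily_return would show how far each stock departs from the bell curve.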
# Grouping all data returns together in a list
# Making a copy of the daily returns dataframe
df_hist = stocks_daily_return.copy()

# Dropping the date
df_hist = df_hist.drop(columns = ['Date'])

data = []

# Looping through every column
for i in df_hist.columns:
  data.append(stocks_daily_return[i].values)
data

# Plotly's Python API contains a super powerful module known as figure factory module
# Figure factory module includes wrapper functions that create unique chart types such as interactive subplots
fig = ff.create_distplot(data, df_hist.columns)
fig.show()
[Figure: Normalized Stock Prices]

Normalization: Comparing normalized stock prices reveals varying magnitudes of growth, with Tesla demonstrating remarkable performance.
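
The normalization step itself isn't shown in the code above. One common approach, assumed here, is to divide each price series by its first value so that every stock starts at 1. The helper name normalize and the toy tickers below are hypothetical:

```python
import pandas as pd

def normalize(df):
    """Divide every non-Date column by its first value so each series starts at 1."""
    x = df.copy()
    # Skip the first column, assumed to be Date
    for col in x.columns[1:]:
        x[col] = x[col] / x[col].iloc[0]
    return x

# Made-up prices for illustration only
toy_df = pd.DataFrame({
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03'],
    'AAA': [100.0, 110.0, 150.0],
    'BBB': [20.0, 22.0, 21.0],
})

toy_norm = normalize(toy_df)
print(toy_norm)
```

After this step a value of 1.5 means the stock is worth 1.5 times its starting price, which puts a $100 stock and a $20 stock on the same scale.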

Additional Visualizations

# Plotting the daily returns
show_plot(stocks_daily_return, 'STOCKS DAILY RETURNS')
[Figure: Stock Prices Daily Returns]

Daily Returns: The distribution of daily returns provides insights into stock volatility, with some stocks showing more stability than others.
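
Volatility can also be put into numbers: the standard deviation of daily returns is a common measure. Here is a minimal sketch on synthetic returns (the columns CALM and WILD are made up), with a rough annualization that assumes about 252 trading days per year and independent daily returns:

```python
import numpy as np
import pandas as pd

# Synthetic daily returns in percent; one quiet series and one volatile series
rng = np.random.default_rng(1)
toy_returns = pd.DataFrame({
    'CALM': rng.normal(0.05, 0.8, 1000),
    'WILD': rng.normal(0.05, 3.0, 1000),
})

# Standard deviation of daily returns, per column
daily_vol = toy_returns.std()

# Rough annualization: scale by the square root of the number of trading days
annual_vol = daily_vol * np.sqrt(252)

summary = pd.DataFrame({'daily_vol_%': daily_vol, 'annualized_vol_%': annual_vol})
print(summary.sort_values('daily_vol_%', ascending=False))
```

Applied to stocks_daily_return (after dropping the Date column), the same two lines would rank the stocks from most to least volatile.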

Conclusion

The visualizations and analyses conducted on the stock data provide several valuable insights:

  1. Trend Analysis: By visualizing raw stock prices over time, we observed distinct upward trends for tech giants like Apple (AAPL), Amazon (AMZN), Tesla (TSLA), and Google (GOOG). This indicates robust growth and market confidence in these companies. Conversely, companies like Boeing (BA) and IBM showed more fluctuation, reflecting industry-specific challenges and changes over the years.

  2. Impact of Market Events: The daily returns analysis highlighted significant volatility during certain periods, which often correlates with major market events. For example, notable dips in the S&P 500 index (sp500) can be linked to global economic events, such as the financial impacts of the COVID-19 pandemic in early 2020.

  3. Cross-Stock Correlations: The correlation heatmap provided insights into how the stocks move relative to each other. High positive correlations between stocks like AMZN, AAPL, and GOOG suggest that these tech stocks often move in tandem, likely due to their interconnected business models and market influences. Meanwhile, lower or negative correlations with stocks from different sectors (e.g., MGM from the hospitality industry) highlight the diversification benefits of a mixed portfolio.

  4. Normalization and Comparison: Normalizing the stock prices allowed us to compare the relative performance of each stock from a common starting point. This showed that, while all stocks had periods of growth, the magnitude of growth varied significantly. Tesla (TSLA), for example, showed remarkable growth compared to other stocks, reflecting its market disruption and investor enthusiasm.

  5. Daily Returns Distribution: The histograms of daily returns showed that returns are roughly bell-shaped and centered near zero, with varying degrees of volatility. Stocks like TSLA and AMZN had wider distributions, indicating higher volatility, while others like T (AT&T) and IBM had narrower distributions, reflecting more stable performance.

Overall, the analysis underscores the importance of understanding both individual stock behaviors and their interactions within a portfolio. By leveraging data visualization, we can uncover trends, identify periods of volatility, and make more informed investment decisions.