Step-by-Step Guide: Get Any US Stock Price Data with Python and Alpaca Markets
The US Stock Market is the largest stock market in the world. Worth $50.8 trillion on 1 January 2024, it remains one of the fastest-growing wealth creation mechanisms in the world.
The size and complexity of the market are enormous. There are multiple exchanges (e.g. NASDAQ, NYSE) and over 10,000 different companies you could analyze.
Analyzing a market of this size is a perfect opportunity to use automation. In no time at all (or maybe 30 minutes), I’ll show you how to retrieve the price data for any stock or combination of stocks on the market — a perfect setup for the rest of this series.
So let's jump in.
About this Episode
In this episode, I’ll show you everything you need to retrieve historical market data for any US based stock. You’ll be using a combination of:
- Market data from Alpaca Markets
- Python code
What You’ll Need to Complete the Episode
The best way to complete this episode is to have a dev environment set up and ready to go. I’ll be using the dev environment I created in a previous episode.
Level Up Your Learning Experience
I’m passionate about effective learning environments, so this episode includes some wonderful learning resources you can access.
- Completed code. If you’re someone who likes to view the solution while developing, then check out the completed code, freely available, on my GitHub.
- Helpful help. If you get stuck and need some helpful help, then jump into our Discord channel and ask questions. Our community loves to help! Note that you’ll need to sign up to access this (although it won’t cost you anything) and it’s for help with my blog content only 😁
- Video content. If you prefer video content to written content, then head over to my YouTube channel to view my content there. At the time of writing, the video content for this episode is scheduled to go live within the next week.
- Advanced content. Some of my more advanced content can only be accessed through the TradeOxy blog platform. This allows me to monetize my work and hence continue providing powerful solutions. Sign up here if that’s of interest to you.
Legal Stuff
DYOR. Note that all trading is at your own risk. My goal is to provide you with the self-developed methods, systems, and tools I use — it is up to you to figure out if this solution works for you AND if I’ve provided credible content. Always DYOR (Do Your Own Research)
Referrals. I receive no commissions for any of the products I mention in this blog. They’re all free (or have a free tier), and I simply provide links to simplify your learning experience.
AI Use. No AI was harmed in the creation of this blog. Some of the images are partially generated or enhanced through AI tools, but we always use a real human to put them together. I do not use AI to generate the text, just to spell check.
Retrieve Stock Market Data from Alpaca Markets
Step 1: Sign Up for Alpaca Markets
Alpaca Markets advertises itself as an API-driven platform for trading the stock market. As a long-time user, I'm always impressed with the platform's ease of use, the power of its API, and the ongoing addition of new markets.
Their documentation is pretty good, and you can trade Stocks, Crypto, and Forex.
To sign up for Alpaca Markets, follow this link.
Step 2: Set Up Your Trading Bot
The purpose of this series is to build a trading bot. Therefore, we want to make our code as simple as possible.
One of the best ways to do this is to separate out the various pieces of functionality required. That way, we can refer back to them as and when we need to. It also allows us to comply with a powerful development principle called DRY — Don’t Repeat Yourself.
To do this, head to your trading bot dev environment and create a file called alpaca_interactions.py. This file will be where we handle any interactions with the Alpaca Markets API.
Here’s what your file system should look like with this file added:
Step 3: Get Your Alpaca Markets Authentication Keys
Authentication to Alpaca Markets is handled through the use of an API Key and an API Secret. You receive a different key pair for each Alpaca Markets account you use.
In this series, we’ll be using Paper Trading for all the examples. This ensures we don’t lose any money while we’re experimenting.
To get your keys, follow these steps:
- Log in to your Alpaca account
- Choose the Paper option
- Find the button “View API Keys”
- Press the button
- Generate your keys
Note. Make sure you record your API secret key immediately (and in a safe place!), as Alpaca Markets keeps no record of the secret key if you lose it.
Step 4: Add the Keys to Your Trading Bot
As with all things related to secrets in coding, it’s super critical to keep your keys secure. Losing them can be an absolute pain, and if they get stolen and misused…well, I’m sure you can imagine the hassle!
Fortunately, the GitHub Codespace we’re using for our dev environment has some neat secret storage features freely available to us. They’re called GitHub Encrypted Secrets, and you can read more about them here.
For our purposes, this will allow us to store the secrets in a way that’s both really secure AND easy for you (and only you) to access.
Here’s how to add them:
- Go to your GitHub repo (where your Codespace is located)
- Click on your profile picture in the top right
- Navigate to your Settings
- Navigate to the Codespaces section
- Select “New Secret”
- Add the Alpaca API Key, calling it “ALPACA_API” (CAPS included)
- Add the API Secret Key, calling it “ALPACA_SECRET_API” (CAPS included)
At this point, your Codespace should notify you that a new secret has been added. Choose the option to reload your environment.
Step 5: Securely Import Your Alpaca Keys to Your Trading Bot
It’s time to get some stock data!
We'll start by enabling your trading bot to access the API Key and API Secret Key you just created. Add the following code to your alpaca_interactions.py file:
import os
import requests
# Set the Alpaca.Markets API Key
API_KEY = os.getenv('ALPACA_API')
API_SECRET = os.getenv('ALPACA_SECRET_API')
This code imports the os library into your trading bot, then sets two variables, one for each of your keys. The requests library is used later.
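Before moving on, it can help to confirm that both secrets actually reached your environment without printing them in full. The sketch below is one possible way to do that. The helper names mask_secret and report_keys are hypothetical (they are not part of the code in this episode), and the only assumption is that the environment variables are named as in Step 4.

```python
import os


def mask_secret(value, visible=4):
    """Return a masked preview of a secret, or 'NOT SET' if it is missing."""
    if not value:
        return "NOT SET"
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


def report_keys(env=os.environ):
    """Build a masked report for the two Alpaca variables from Step 4."""
    names = ["ALPACA_API", "ALPACA_SECRET_API"]
    return {name: mask_secret(env.get(name)) for name in names}


if __name__ == "__main__":
    # Prints only a short preview, never the full key
    for name, preview in report_keys().items():
        print(f"{name}: {preview}")
```

If either line shows NOT SET, the secret has probably not been loaded into the Codespace yet, and reloading the environment is a reasonable first thing to try.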
Step 6: Set Up Market Data Retrieval from Alpaca Markets
In this trading bot, we’ll be using the raw HTTP endpoint for Alpaca Markets, rather than transiting through their Python SDK. The reason for this is that the Python SDK is almost impossibly complex and not very well documented. In contrast, the raw HTTP endpoint is much simpler to use and is well documented. Over the years, this has been a much more effective API for me.
Here’s what you need to do:
- Query the right URL endpoint from Alpaca Markets
- Authenticate to the endpoint
- Retrieve data
- Handle any errors gracefully (i.e. actually tell you what went wrong)
- Convert the responses to JSON
- Return the result
To do this, add the code below to your alpaca_interactions.py file:
# Base function for querying the Alpaca.Markets API
def query_alpaca_api(url: str, params: dict) -> dict:
    """
    Base function for querying the Alpaca.Markets API
    :param url: The URL to query
    :param params: The parameters to pass to the API
    """
    # Check that the API Key and Secret are not None
    if API_KEY is None:
        raise ValueError("The API Key is not set.")
    if API_SECRET is None:
        raise ValueError("The API Secret is not set.")
    # Set the header information
    headers = {
        'accept': 'application/json',
        'APCA-API-KEY-ID': API_KEY,
        'APCA-API-SECRET-KEY': API_SECRET
    }
    try:
        # Get the response from the API endpoint
        response = requests.get(url, headers=headers, params=params)
    except Exception as exception:
        print(f"An exception occurred when querying the URL {url} with the parameters {params}: {exception}")
        raise exception
    # Get the response code
    response_code = response.status_code
    # If the response code is 403, print that the API key and or secret are incorrect
    if response_code == 403:
        print("The API key and or secret are incorrect.")
        raise ValueError("The API key and or secret are incorrect.")
    # Convert the response to JSON
    json_response = response.json()
    # Return the JSON response
    return json_response
Finally, head to your requirements.txt and add a line containing requests at the bottom.
Nice work!
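The function above only singles out a 403 response. As a possible extension, the sketch below maps a few other HTTP status codes to readable messages before the JSON is used. The helpers explain_status and ensure_ok are hypothetical names, and the messages are illustrative wording based on general HTTP status meanings rather than anything taken from Alpaca's documentation.

```python
def explain_status(status_code):
    """Return None for a successful response, else a readable error message."""
    if 200 <= status_code < 300:
        return None
    messages = {
        401: "Unauthorized: check that the API key and secret are set correctly.",
        403: "Forbidden: the API key and or secret are incorrect.",
        404: "Not found: check the URL you are querying.",
        422: "Unprocessable request: check the query parameters.",
        429: "Too many requests: slow down and retry later.",
    }
    # Fall back to a generic message for codes not listed above
    return messages.get(status_code, f"Unexpected HTTP status code: {status_code}")


def ensure_ok(status_code):
    """Raise a ValueError with a readable message for non-success codes."""
    message = explain_status(status_code)
    if message is not None:
        raise ValueError(message)
```

Inside query_alpaca_api(), one option would be to call ensure_ok(response.status_code) just before converting the response to JSON.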
Format Your Stock Market Data for Future Trading Bot Greatness!
Okies. Right now we can successfully retrieve historical pricing data from Alpaca Markets.
However, if you were to look at the incoming data, you'd quickly find it's not very easy to read. Furthermore, there are a few ways we can modify our data so that it's much easier to convert into strategies further on.
Hello Pandas!
To do this, we'll be leveraging a famous Python library called Pandas. This library is one of the truly great Python libraries, and in true Python style, it's completely free.
Pandas is pretty amazing. It's extremely fast, very flexible, and used throughout the data analysis world. I'm yet to find something data related that I can't interface with.
Step 1: Add Pandas to Your Trading Bot
First things first, we'll need to add the Pandas library to our trading bot. To do this, return to your requirements.txt file and add the Pandas library.
At this point, your requirements.txt file should look like this:
requests
pandas
Now, update your dev environment by running the command pip install -r requirements.txt in your terminal (or bash).
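If you'd like to confirm the install worked before going further, a quick check along these lines is one option. It's a minimal sketch using only the standard library, and missing_packages is a hypothetical helper name, not something used elsewhere in this series.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two packages listed in requirements.txt
    missing = missing_packages(["requests", "pandas"])
    if missing:
        print(f"Still missing: {', '.join(missing)}")
    else:
        print("All packages are installed.")
```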
Step 2: Update Alpaca Interactions File to Retrieve Data
As I mentioned before, all interaction with Alpaca Markets is through our alpaca_interactions.py file. Therefore, we need to update this file to include the API query functionality.
To do this, start by adding these two lines to the TOP of your alpaca_interactions.py file:
import pandas
import datetime
Next, add the below code to the BOTTOM of this file:
# Function to retrieve historical candlestick data from Alpaca.Markets
def get_historic_bars(symbols: list, timeframe: str, limit: int, start_date: datetime.datetime, end_date: datetime.datetime) -> pandas.DataFrame:
    """
    Function to retrieve historical candlestick data from Alpaca.Markets
    :param symbols: The symbols to retrieve the historical data for
    :param timeframe: The timeframe to retrieve the historical data for
    :param limit: The number of bars to retrieve
    :param start_date: The start date for the historical data
    :param end_date: The end date for the historical data
    """
    # Check that the start_date and end_date are datetime objects
    if not isinstance(start_date, datetime.datetime):
        raise ValueError("The start_date must be a datetime object.")
    if not isinstance(end_date, datetime.datetime):
        raise ValueError("The end_date must be a datetime object.")
    # Check that the end date is not in the future
    if end_date > datetime.datetime.now():
        print("The end date is in the future. Setting the end date to now.")
        end_date = datetime.datetime.now()
    # Check that the start date is not after the end date
    if start_date > end_date:
        raise ValueError("The start date cannot be after the end date.")
    # Convert the symbols list to a comma-separated string
    symbols_joined = ",".join(symbols)
    # Set the start and end dates to the correct format - they should only include days
    start_date = start_date.strftime("%Y-%m-%d")
    end_date = end_date.strftime("%Y-%m-%d")
    # Create the params dictionary
    params = {
        "symbols": symbols_joined,
        "timeframe": timeframe,
        "limit": limit,
        "start": start_date,
        "end": end_date,
        "adjustment": "raw",
        "feed": "sip",
        "sort": "asc"
    }
    # Set the API endpoint
    url = "https://data.alpaca.markets/v2/stocks/bars"
    # Send to the base function to query the API
    try:
        json_response = query_alpaca_api(url, params)
    except Exception as exception:
        print(f"An exception occurred in the function get_historic_bars() with the parameters {params}: {exception}")
        raise exception
    # Extract the bars from the JSON response
    json_response = json_response["bars"]
    # Create an empty parent dataframe
    bars_df = pandas.DataFrame()
    # Iterate through the symbols list
    for symbol in symbols:
        # Extract the bars for the symbol (an empty list if none were returned)
        symbol_bars = json_response.get(symbol, [])
        # Convert the bars to a dataframe
        symbol_bars_df = pandas.DataFrame(symbol_bars)
        # Add the symbol column
        symbol_bars_df["symbol"] = symbol
        # Modify the following column names to be more descriptive:
        # o -> candle_open
        # h -> candle_high
        # l -> candle_low
        # c -> candle_close
        # v -> candle_volume
        # t -> candle_timestamp
        # vw -> vwap
        # Rename the columns
        symbol_bars_df = symbol_bars_df.rename(
            columns={
                "o": "candle_open",
                "h": "candle_high",
                "l": "candle_low",
                "c": "candle_close",
                "v": "candle_volume",
                "t": "candle_timestamp",
                "vw": "vwap"
            }
        )
        # Add the symbol bars to the parent dataframe
        bars_df = pandas.concat([bars_df, symbol_bars_df])
    # Return the historical bars
    return bars_df
This code is pretty powerful, and it does a lot. Here’s an overview:
- Retrieves the stock(s) and timeframe you specify, along with parameters such as the number of candlesticks you want, and a date range to query.
- Checks your inputs (and tells you when they’re wrong)
- Does a series of conversions to match the format the Alpaca Markets endpoint requires
- Converts the retrieved data into a Pandas Dataframe
- Returns the dataframe to you
Note. The data format we convert this API response into is consistent across my entire trading bot series. So even if you look at my series about Polygon, Binance, and so on, you'll find the format is exactly the same!
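To make the conversion step concrete, here is a small standard-library sketch of the reshaping that get_historic_bars() performs. The sample payload is invented for illustration (the prices are not real market data), and its shape, a bars dictionary keyed by symbol, is assumed from the way the function above reads the response. flatten_bars is a hypothetical name.

```python
# Column names used by get_historic_bars() above
COLUMN_NAMES = {
    "o": "candle_open",
    "h": "candle_high",
    "l": "candle_low",
    "c": "candle_close",
    "v": "candle_volume",
    "t": "candle_timestamp",
    "vw": "vwap",
}


def flatten_bars(bars_by_symbol):
    """Turn {symbol: [bar, ...]} into a flat list of renamed row dictionaries."""
    rows = []
    for symbol, bars in bars_by_symbol.items():
        for bar in bars:
            # Rename known keys and keep any other keys unchanged
            row = {COLUMN_NAMES.get(key, key): value for key, value in bar.items()}
            row["symbol"] = symbol
            rows.append(row)
    return rows


# Invented sample payload, for illustration only
sample_response = {
    "bars": {
        "AAPL": [
            {"t": "2024-01-02T05:00:00Z", "o": 10.0, "h": 12.0, "l": 9.5, "c": 11.0, "v": 1000, "vw": 10.8},
            {"t": "2024-01-03T05:00:00Z", "o": 11.0, "h": 11.5, "l": 10.0, "c": 10.5, "v": 800, "vw": 10.7},
        ]
    }
}

rows = flatten_bars(sample_response["bars"])
print(rows[0]["candle_close"], rows[0]["symbol"])
```

Passing rows to pandas.DataFrame(rows) would give a dataframe with the same column names the function above produces.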
Let’s Actually Run the Code!
Our trading bot is now ready to retrieve historic candlestick data from Alpaca Markets!
I’ll show you how.
Step 1: App.py
Depending on whether you've completed some of my other content, you may or may not already have an app.py set up in your dev environment.
If you don't, go ahead and create that now.
Step 2: Update App.py
Add the following code to app.py:
import alpaca_interactions as alpaca
import datetime

# List of symbols
symbols = ["AAPL"]
max_number_of_candles = 100
timeframe = "1day"

# Function to run the trading bot
def auto_run_trading_bot():
    """
    Function to run the trading bot
    """
    # Print Welcome to your very own trading bot
    print("Welcome to your very own trading bot")
    # Set the end date to yesterday
    end_date = datetime.datetime.now() - datetime.timedelta(days=1) # Note that if you have a premium subscription you can remove this restriction
    # Set the start date to one year ago
    start_date = end_date - datetime.timedelta(days=365)
    # Get the historical data
    for symbol in symbols:
        # Convert symbol to a list
        symbol = [symbol]
        # Get the historical data
        historical_data = alpaca.get_historic_bars(
            symbols=symbol,
            timeframe=timeframe,
            start_date=start_date,
            end_date=end_date,
            limit=max_number_of_candles
        )
        # Print the historical data
        print(historical_data) # <- this can be removed if you don't want to see the progression

# Main function for program
if __name__ == "__main__":
    auto_run_trading_bot()
Here, we’ve performed the following steps:
- Imported our alpaca_interactions.py library that we've been building throughout this episode
- Defined a symbol (AAPL), timeframe, and number of candlesticks (klines) to retrieve from Alpaca Markets
- Printed a message to the terminal so we know it’s started
- Retrieved the data
- Printed the data to the screen
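The date window that app.py builds (ending yesterday, starting a year before that) can also be sketched in isolation, which makes it easier to experiment with other ranges. build_date_window is a hypothetical helper, not part of the episode's code, and the fixed date passed in is just an example value so the output is reproducible.

```python
import datetime


def build_date_window(now, lookback_days=365, delay_days=1):
    """Return (start, end) as YYYY-MM-DD strings, ending delay_days before now."""
    end_date = now - datetime.timedelta(days=delay_days)
    start_date = end_date - datetime.timedelta(days=lookback_days)
    return start_date.strftime("%Y-%m-%d"), end_date.strftime("%Y-%m-%d")


# A fixed "now" keeps the example reproducible
start, end = build_date_window(datetime.datetime(2023, 6, 15, 9, 30))
print(start, end)  # 2022-06-14 2023-06-14
```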
If you go to your terminal and run app.py, you should see some results.
Here's what I got (note your actual AAPL data will be different as you're running it on a different day than I am):
Step 3: Scale Up Your Data Retrieval
To demonstrate the power of an algorithmic trading bot, let’s update some of our input parameters.
Symbols. For instance, let’s say you wanted to retrieve the daily data for the Facebook (now known as META), Apple, Amazon, Netflix, and Google (now known as Alphabet) tickers. This is known as the original FAANG group of companies.
To do this, all you need to do is alter your symbols variable to look like this:
symbols = ["AAPL", "GOOGL", "META", "NFLX", "AMZN"]
Timeframe. To retrieve different timeframes, update your timeframe variable. For instance:
timeframe = '30min'
More candlesticks. Update your max_number_of_candles variable. For instance:
max_number_of_candles = 1000
With a minimum amount of work, you can drastically increase the power of your trading bot!
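One caveat when asking for more candlesticks: a single response may not contain everything you asked for. The sketch below assumes that the response includes a next_page_token field and that the request accepts a page_token parameter, which is how Alpaca's market data documentation describes paging; treat that as an assumption and check the documentation before relying on it. fetch_all_pages and the injected fetch_page function are hypothetical names, and the fake fetcher exists only so the example runs without network access.

```python
def fetch_all_pages(fetch_page, params, max_pages=10):
    """Collect bars per symbol across pages until no token is returned."""
    collected = {}
    page_params = dict(params)  # copy so the caller's dict is not modified
    for _ in range(max_pages):
        response = fetch_page(page_params)
        for symbol, bars in (response.get("bars") or {}).items():
            collected.setdefault(symbol, []).extend(bars)
        token = response.get("next_page_token")
        if not token:
            break
        # Ask for the next page on the following request
        page_params["page_token"] = token
    return collected


# Fake two-page API used only for this demonstration
def fake_fetch_page(params):
    if params.get("page_token") is None:
        return {"bars": {"AAPL": [{"c": 1.0}]}, "next_page_token": "abc"}
    return {"bars": {"AAPL": [{"c": 2.0}]}, "next_page_token": None}


all_bars = fetch_all_pages(fake_fetch_page, {"symbols": "AAPL"})
print(len(all_bars["AAPL"]))  # 2
```

With the code from this episode, fetch_page could be a small wrapper around query_alpaca_api() that passes in the bars URL.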
Next Steps
You've got everything you need to retrieve data for any stock on the US Stock Markets. This sets the foundation for building an incredibly powerful stock market trading bot.
Follow my blog to see the various types of trading bots you can design and build, such as:
- Adding powerful technical indicators
- Retrieving fundamental data such as earnings per share
- Retrieving sentiment analysis data so you can see what the markets are thinking / feeling
- Adding AI analysis to your trading bot
- Automating all your trading bot functions
- And more
Say Hi!
I love hearing from my readers, so feel free to reach out. It means a ton to me when you clap for my articles or drop a friendly comment — it helps me know that my content is helping.
❤
Complete Code for alpaca_interactions.py
import os
import requests
import pandas
import datetime

# Set the Alpaca.Markets API Key
API_KEY = os.getenv('ALPACA_API')
API_SECRET = os.getenv('ALPACA_SECRET_API')

# Base function for querying the Alpaca.Markets API
def query_alpaca_api(url: str, params: dict) -> dict:
    """
    Base function for querying the Alpaca.Markets API
    :param url: The URL to query
    :param params: The parameters to pass to the API
    """
    # Check that the API Key and Secret are not None
    if API_KEY is None:
        raise ValueError("The API Key is not set.")
    if API_SECRET is None:
        raise ValueError("The API Secret is not set.")
    # Set the header information
    headers = {
        'accept': 'application/json',
        'APCA-API-KEY-ID': API_KEY,
        'APCA-API-SECRET-KEY': API_SECRET
    }
    try:
        # Get the response from the API endpoint
        response = requests.get(url, headers=headers, params=params)
    except Exception as exception:
        print(f"An exception occurred when querying the URL {url} with the parameters {params}: {exception}")
        raise exception
    # Get the response code
    response_code = response.status_code
    # If the response code is 403, print that the API key and or secret are incorrect
    if response_code == 403:
        print("The API key and or secret are incorrect.")
        raise ValueError("The API key and or secret are incorrect.")
    # Convert the response to JSON
    json_response = response.json()
    # Return the JSON response
    return json_response

# Function to retrieve historical candlestick data from Alpaca.Markets
def get_historic_bars(symbols: list, timeframe: str, limit: int, start_date: datetime.datetime, end_date: datetime.datetime) -> pandas.DataFrame:
    """
    Function to retrieve historical candlestick data from Alpaca.Markets
    :param symbols: The symbols to retrieve the historical data for
    :param timeframe: The timeframe to retrieve the historical data for
    :param limit: The number of bars to retrieve
    :param start_date: The start date for the historical data
    :param end_date: The end date for the historical data
    """
    # Check that the start_date and end_date are datetime objects
    if not isinstance(start_date, datetime.datetime):
        raise ValueError("The start_date must be a datetime object.")
    if not isinstance(end_date, datetime.datetime):
        raise ValueError("The end_date must be a datetime object.")
    # Check that the end date is not in the future
    if end_date > datetime.datetime.now():
        print("The end date is in the future. Setting the end date to now.")
        end_date = datetime.datetime.now()
    # Check that the start date is not after the end date
    if start_date > end_date:
        raise ValueError("The start date cannot be after the end date.")
    # Convert the symbols list to a comma-separated string
    symbols_joined = ",".join(symbols)
    # Set the start and end dates to the correct format - they should only include days
    start_date = start_date.strftime("%Y-%m-%d")
    end_date = end_date.strftime("%Y-%m-%d")
    # Create the params dictionary
    params = {
        "symbols": symbols_joined,
        "timeframe": timeframe,
        "limit": limit,
        "start": start_date,
        "end": end_date,
        "adjustment": "raw",
        "feed": "sip",
        "sort": "asc"
    }
    # Set the API endpoint
    url = "https://data.alpaca.markets/v2/stocks/bars"
    # Send to the base function to query the API
    try:
        json_response = query_alpaca_api(url, params)
    except Exception as exception:
        print(f"An exception occurred in the function get_historic_bars() with the parameters {params}: {exception}")
        raise exception
    # Extract the bars from the JSON response
    json_response = json_response["bars"]
    # Create an empty parent dataframe
    bars_df = pandas.DataFrame()
    # Iterate through the symbols list
    for symbol in symbols:
        # Extract the bars for the symbol (an empty list if none were returned)
        symbol_bars = json_response.get(symbol, [])
        # Convert the bars to a dataframe
        symbol_bars_df = pandas.DataFrame(symbol_bars)
        # Add the symbol column
        symbol_bars_df["symbol"] = symbol
        # Modify the following column names to be more descriptive:
        # o -> candle_open
        # h -> candle_high
        # l -> candle_low
        # c -> candle_close
        # v -> candle_volume
        # t -> candle_timestamp
        # vw -> vwap
        # Rename the columns
        symbol_bars_df = symbol_bars_df.rename(
            columns={
                "o": "candle_open",
                "h": "candle_high",
                "l": "candle_low",
                "c": "candle_close",
                "v": "candle_volume",
                "t": "candle_timestamp",
                "vw": "vwap"
            }
        )
        # Add the symbol bars to the parent dataframe
        bars_df = pandas.concat([bars_df, symbol_bars_df])
    # Return the historical bars
    return bars_df