An introduction to YouTube Data API and Etags.
This article showcases what information we can obtain from the YouTube Data API. To use the API, you need to create a Google Cloud Project and a set of credentials: an API key, a service account, or an OAuth client ID. Please refer to the previous article for further information about authentication and authorisation for Google APIs.
In another previous article we sampled a few hundred channel_id values from the main page of YouTube. This article starts from these channel_id values and retrieves information about their videos.
This article will also talk about the etag and the if-none-match HTTP header. One can use this header to save time and bandwidth by caching previous responses.
YouTube provides full documentation for this API. In short, one can use this API for things that are normally done on the YouTube website. For example, one can view statistics about videos and channels, view comments on videos, explore items in playlists, perform search queries, etc. The API also allows us to insert comments, captions, etc., as long as an authorised token is provided.
As described previously, this article goes from a set of channel_id values to the videos published by those channels. It covers 3 different endpoints for this purpose: List Channels, List PlaylistItems and List Videos.
We will obtain, from the List Channels endpoint, a playlist_id that contains all the videos published by the channel. Then we will find all the items (videos) inside this playlist from the List PlaylistItems endpoint to obtain the video_id of each video. Finally we can get all the information about the videos from the List Videos endpoint.
Please be aware that each Google Cloud Project has a daily quota limit. Each project has 10,000 units of quota each day, and different methods cost different numbers of units.
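To get a feel for how far the daily quota goes, the rough sketch below estimates the units needed for the workflow in this article. It rests on assumptions that should be checked against the quota cost table in the official documentation: each list call is taken to cost 1 unit, and channels, playlist items and videos are fetched 50 at a time (the batch size discussed later in this article). The function name and its argument are hypothetical.

```python
from math import ceil

DAILY_QUOTA = 10_000      # units per project per day, as stated above
UNITS_PER_LIST_CALL = 1   # assumption: verify against the official quota costs
BATCH_SIZE = 50           # items per request or per page


def estimate_quota_units(videos_per_channel):
    """Estimate the units needed to go from channels to full video details.

    videos_per_channel: list with the number of videos of each channel.
    """
    # One List Channels call per batch of 50 channels
    channel_calls = ceil(len(videos_per_channel) / BATCH_SIZE)
    # One List PlaylistItems call per page (at least one page per channel)
    playlist_calls = sum(max(1, ceil(v / BATCH_SIZE)) for v in videos_per_channel)
    # One List Videos call per batch of 50 videos, batched per channel
    video_calls = sum(ceil(v / BATCH_SIZE) for v in videos_per_channel)
    return (channel_calls + playlist_calls + video_calls) * UNITS_PER_LIST_CALL


units = estimate_quota_units([516, 155])
print(units, "of", DAILY_QUOTA, "units")
```

Under these assumptions, two channels with 516 and 155 videos need 1 + 15 + 15 = 31 units.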
When you have your access token ready (see previous article), the following script gives an example request to the API.
import os
import requests

# Pick ONE of the three header definitions below, depending on your credentials
# Headers for GCP OAuth2 Token
headers = {"Authorization": f'Bearer {os.getenv("GCP_OAUTH_TOKEN")}'}
# Headers for GCP Service Account
headers = {"Authorization": f'Bearer {os.getenv("GCP_SERVICE_ACC_TOKEN")}'}
# Headers for GCP API Key
headers = {"x-goog-api-key": os.getenv("GCP_API_KEY")}

# Request
response = requests.request(
    method=method,  # GET or POST or other HTTP methods
    url=url,  # Endpoint of service
    params=params,  # Query or parameters (if needed)
    json=body,  # Request body (if needed)
    headers=headers  # The headers specified above
)
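One caveat of the script above is that os.getenv returns None when a variable is unset, so a missing token silently becomes a header such as "Bearer None". A minimal sketch of a guard is shown below. build_headers is a hypothetical helper, not part of any Google library, and it assumes the same environment variable names as above.

```python
import os


def build_headers(env=None):
    """Return request headers for the first credential found in the environment."""
    env = os.environ if env is None else env
    # OAuth2 and service account tokens are both sent as bearer tokens
    for name in ("GCP_OAUTH_TOKEN", "GCP_SERVICE_ACC_TOKEN"):
        token = env.get(name)
        if token:
            return {"Authorization": f"Bearer {token}"}
    # Fall back to an API key
    api_key = env.get("GCP_API_KEY")
    if api_key:
        return {"x-goog-api-key": api_key}
    raise RuntimeError(
        "Set GCP_OAUTH_TOKEN, GCP_SERVICE_ACC_TOKEN or GCP_API_KEY first"
    )


# Passing a dict instead of os.environ makes the helper easy to try out
print(build_headers({"GCP_API_KEY": "my-key"}))
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the API.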
In this section, we will talk about the List Channels endpoint. From the previous article, we obtained hundreds of channel_id values.
channel_ids = [
    "UCx1m6AboILQKMNXbsLjjI4Q",
    "UC-iICi3q1AF_9WI--DVacTQ"
]
We will pass these channel ids to the endpoint
GET https://www.googleapis.com/youtube/v3/channels
We are also required to specify which part(s) of the channel resource we want to request.
parts = [
    "brandingSettings", "contentDetails", "contentOwnerDetails", "id",
    "localizations", "snippet", "statistics", "status", "topicDetails"
]
Now we are ready to make the request. Note that you can request information for at most 50 channels in one request.
params = {
    "part": ",".join(parts),
    "id": ",".join(channel_ids)  # Can put 50 channel ids in 1 request
}
response = requests.get(
    url="https://www.googleapis.com/youtube/v3/channels",
    params=params,
    headers=headers  # See the "Before Requests" section
)
response.json()
# {
# "kind": "youtube#channelListResponse",
# "etag": "UVHnd3n70mtZnwVUmK2mRePbVCo",
# "pageInfo": {
# "totalResults": 2,
# "resultsPerPage": 5
# },
# "items": [
# {
# "kind": "youtube#channel",
# "etag": "KGqojbKzM5T5vGED4lXuDv1kPr4",
# "id": "UCU0zC0L4o0qOos83NLhK5ug",
# "snippet": {
# "title": "Ginger Cat",
# "description": "Welcome to Ginger Cat, where for everyone who loves animal and care about their wellbeing. Our goal is to make caring about animals a viral cause. I hope you guys have a good time in here!\n\n",
# "customUrl": "@gingercatofficial",
# "publishedAt": "2016-11-27T04:02:23Z",
# "thumbnails": {...},
# "localized": {...},
# "country": "US"
# },
# "contentDetails": {
# "relatedPlaylists": {
# "likes": "",
# "uploads": "UUU0zC0L4o0qOos83NLhK5ug"
# }
# },
# "statistics": {
# "viewCount": "155315923",
# "subscriberCount": "347000",
# "hiddenSubscriberCount": false,
# "videoCount": "516"
# },
# "topicDetails": {...},
# "status": {...},
# "brandingSettings": {...},
# "contentOwnerDetails": {}
# },
# {
# "kind": "youtube#channel",
# "etag": "4_wvn4zWGtuxuWYN4ddla40T5S8",
# "id": "UCx1m6AboILQKMNXbsLjjI4Q",
# "snippet": {
# "title": "Proper DIY",
# "description": "After many years of 'Doing It Myself' I now invite you the viewer into my world of DIY projects around the house and garden. I hope to inspire everyone to give It a go and to use your new knowledge and skill to save money while having fun and a sense of achievement.\n\nThe channel was launched on 1st Jan 2021 and there will be weekly videos posted throughout 2021 on everything from changing a plug to building and electrical works. Please subscribe to ensure you follow me through the Youtube journey!\n",
# "customUrl": "@properdiy",
# "publishedAt": "2018-12-03T07:05:09Z",
# "thumbnails": {...},
# "localized": {...},
# "country": "GB"
# },
# "contentDetails": {
# "relatedPlaylists": {
# "likes": "",
# "uploads": "UUx1m6AboILQKMNXbsLjjI4Q"
# }
# },
# "statistics": {
# "viewCount": "22997422",
# "subscriberCount": "246000",
# "hiddenSubscriberCount": false,
# "videoCount": "155"
# },
# "topicDetails": {...},
# "status": {...},
# "brandingSettings": {...},
# "contentOwnerDetails": {}
# }
# ]
# }
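Since each request accepts at most 50 channel ids, a longer list has to be split into batches first. The sketch below shows one way to do that. chunked and build_channel_params are hypothetical helper names, and each resulting dictionary is meant to be passed as params to a request like the one above.

```python
def chunked(seq, size=50):
    """Yield consecutive slices of seq with at most `size` elements each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def build_channel_params(channel_ids, parts, size=50):
    """Build one query-parameter dict per batch of channel ids."""
    return [
        {"part": ",".join(parts), "id": ",".join(batch)}
        for batch in chunked(channel_ids, size)
    ]


# 120 made-up ids are split into batches of 50, 50 and 20
example_ids = [f"channel-{i}" for i in range(120)]
example_params = build_channel_params(example_ids, ["snippet", "statistics"])
print(len(example_params))
```

The responses of the individual requests can then be concatenated through their "items" lists.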
From the above step, we can obtain the playlists containing all the videos published by the channels.
playlist_ids = [
    x["contentDetails"]["relatedPlaylists"]["uploads"] for x in response.json()["items"]
]
playlist_ids
# ['UUU0zC0L4o0qOos83NLhK5ug', 'UUx1m6AboILQKMNXbsLjjI4Q']
Now we can list all items inside a playlist. Note that we can only list the items of a single playlist in each request. We also need to set the maximum number of results returned by the service through maxResults, whose largest allowed value is 50.
Yet channels often have more than 50 videos, so we need to loop through the pages via the nextPageToken key.
The function below loops through all the videos in a playlist.
def fetch_all_videos_in_playlist(playlist_id):
    parts = [
        "contentDetails", "id", "snippet", "status"
    ]
    page_token = ""
    items = []
    # Keep requesting pages until the response has no nextPageToken
    while page_token is not None:
        params = {
            "part": ",".join(parts),
            "playlistId": playlist_id,
            "maxResults": 50,
            "pageToken": page_token or None  # requests drops params set to None
        }
        res = requests.get(
            url="https://www.googleapis.com/youtube/v3/playlistItems",
            params=params,
            headers=headers
        )
        items += res.json()["items"]
        page_token = res.json().get("nextPageToken")
    return items
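The paging logic can also be separated from the HTTP call, which makes it possible to check the loop without spending any quota. The sketch below is a variant under stated assumptions, not the function above: iter_items is a hypothetical name, fetch_page stands for any callable that takes a page token (or None for the first page) and returns the parsed JSON of one page, and FAKE_PAGES is made-up data standing in for the API.

```python
def iter_items(fetch_page):
    """Yield every item across all pages by following nextPageToken."""
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("items", [])
        page_token = page.get("nextPageToken")
        # The last page carries no nextPageToken
        if page_token is None:
            break


# Stand-in for the API: two pages linked by a nextPageToken (made-up data)
FAKE_PAGES = {
    None: {"items": ["video-1", "video-2"], "nextPageToken": "page-2"},
    "page-2": {"items": ["video-3"]},
}

all_items = list(iter_items(FAKE_PAGES.get))
print(all_items)  # ['video-1', 'video-2', 'video-3']
```

In real use, fetch_page would wrap the requests.get call to the playlistItems endpoint and return its JSON body.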
We now try out the function.
# Fetch all the videos in the first channel
playlist_items = fetch_all_videos_in_playlist(playlist_ids[0])
playlist_items[:2]
# [
# {
# "kind": "youtube#playlistItem",
# "etag": "PbvtsMi9EMMPh7IvYd-Rln4rp-A",
# "id": "VVVVMHpDMEw0bzBxT29zODNOTGhLNXVnLkUwVXBKZkJ6X2Ew",
# "snippet": {
# "publishedAt": "2023-08-14T19:00:05Z",
# "channelId": "UCU0zC0L4o0qOos83NLhK5ug",
# "title": "Unforgettable Comedy Duo: Funniest Dog and Human Video Ever! \ud83d\ude31",
# "description": "Unforgettable Comedy Duo: Funniest Dog and Human Video Ever! \ud83d\ude31\nGet ready to experience the unbreakable bond between humans and their furry best friends, as their mischievous antics and genuine love bring out the purest emotions. Don't miss these videos that will brighten your day!\n----------------------------------------------------------------------------\nBest of the WEEK! Funniest Cats And Dogs Videos \ud83d\ude02\nhttps://www.youtube.com/watch?v=knkv4Habj9s&list=PLDae3oPkWscWqZZtu8gqj20Ug9ijrr7Ae&index=1&t=2s&pp=iAQB \n----------------------------------------------------------------------------\n\n#dogs #pets #animal #dog #dogandhuman #animal2023",
# "thumbnails": {...},
# "channelTitle": "Ginger Cat",
# "playlistId": "UUU0zC0L4o0qOos83NLhK5ug",
# "position": 0,
# "resourceId": {
# "kind": "youtube#video",
# "videoId": "E0UpJfBz_a0"
# },
# "videoOwnerChannelTitle": "Ginger Cat",
# "videoOwnerChannelId": "UCU0zC0L4o0qOos83NLhK5ug"
# },
# "contentDetails": {
# "videoId": "E0UpJfBz_a0",
# "videoPublishedAt": "2023-08-14T19:00:05Z"
# },
# "status": {
# "privacyStatus": "public"
# }
# },
# {
# "kind": "youtube#playlistItem",
# "etag": "EwGgaznGyPJpoc6T-ERkuKE1nt8",
# "id": "VVVVMHpDMEw0bzBxT29zODNOTGhLNXVnLmtua3Y0SGFiajlz",
# "snippet": {
# "publishedAt": "2023-08-12T19:00:16Z",
# "channelId": "UCU0zC0L4o0qOos83NLhK5ug",
# "title": "Best of the WEEK! Funniest Cats And Dogs Videos \ud83d\ude02",
# "description": "Best of the WEEK! Funniest Cats And Dogs Videos \ud83d\ude02\nBeing an animal lover, these were cute!\n----------------------------------------------------------------------------\nDogs can talk, dogs can understand. Who knew?\ud83d\ude31 \nhttps://www.youtube.com/watch?v=D6aecAswvyE&list=PLDae3oPkWscWqZZtu8gqj20Ug9ijrr7Ae&index=1&pp=iAQB \n----------------------------------------------------------------------------\n\n#dogs #pets #animal #dog #animal #animal2023",
# "thumbnails": {...},
# "channelTitle": "Ginger Cat",
# "playlistId": "UUU0zC0L4o0qOos83NLhK5ug",
# "position": 1,
# "resourceId": {
# "kind": "youtube#video",
# "videoId": "knkv4Habj9s"
# },
# "videoOwnerChannelTitle": "Ginger Cat",
# "videoOwnerChannelId": "UCU0zC0L4o0qOos83NLhK5ug"
# },
# "contentDetails": {
# "videoId": "knkv4Habj9s",
# "videoPublishedAt": "2023-08-12T19:00:16Z"
# },
# "status": {
# "privacyStatus": "public"
# }
# }
# ]
Note that the List PlaylistItems method can return at most 20,000 items from a single playlist if you are using an API key.
Now we have a list of videos published by the channel UCU0zC0L4o0qOos83NLhK5ug (uploads playlist UUU0zC0L4o0qOos83NLhK5ug).
video_ids = [x["snippet"]["resourceId"]["videoId"] for x in playlist_items]
video_ids[:10]
# [
# 'E0UpJfBz_a0', 'knkv4Habj9s', 'D6aecAswvyE', 'uGdDiiro8yg', 'pwNqOkkdry4',
# 'ylp36nsXpfE', 'VtOY_2-YBSM', '27VRjJfW3A8', '3BGQGzpeOkU', '8qaV3s8kzcU'
# ]
We can use the List Videos method to extract information about the videos (likeCount, viewCount, commentCount, title, description, etc.). Like the List Channels method, it accepts at most 50 videos per request. The function below handles this by splitting the video ids into batches.
from math import ceil

def fetch_videos(video_ids):
    # Split the ids into batches of at most 50
    n = 50
    batches = ceil(len(video_ids) / n)
    ids = [
        video_ids[(n * i):(n * (i + 1))] for i in range(batches)
    ]
    parts = [
        "contentDetails", "id", "liveStreamingDetails",
        "localizations", "player", "recordingDetails",
        "snippet", "statistics", "status", "topicDetails"
    ]
    items = []
    # One request per batch
    for batch_ids in ids:
        params = {
            "part": ",".join(parts),
            "id": ",".join(batch_ids)
        }
        response = requests.get(
            url="https://www.googleapis.com/youtube/v3/videos",
            params=params,
            headers=headers
        )
        items += response.json()["items"]
    return items
Now we are ready for the API request.
videos = fetch_videos(video_ids)
videos[:2]
# [
# {
# "kind": "youtube#video",
# "etag": "h7TfUpEP8ts-XNzYvActIph1mg4",
# "id": "E0UpJfBz_a0",
# "snippet": {
# "publishedAt": "2023-08-14T19:00:05Z",
# "channelId": "UCU0zC0L4o0qOos83NLhK5ug",
# "title": "Unforgettable Comedy Duo: Funniest Dog and Human Video Ever! \ud83d\ude31",
# "description": "Unforgettable Comedy Duo: Funniest Dog and Human Video Ever! \ud83d\ude31\nGet ready to experience the unbreakable bond between humans and their furry best friends, as their mischievous antics and genuine love bring out the purest emotions. Don't miss these videos that will brighten your day!\n----------------------------------------------------------------------------\nBest of the WEEK! Funniest Cats And Dogs Videos \ud83d\ude02\nhttps://www.youtube.com/watch?v=knkv4Habj9s&list=PLDae3oPkWscWqZZtu8gqj20Ug9ijrr7Ae&index=1&t=2s&pp=iAQB \n----------------------------------------------------------------------------\n\n#dogs #pets #animal #dog #dogandhuman #animal2023",
# "thumbnails": {...},
# "channelTitle": "Ginger Cat",
# "tags": [...],
# "categoryId": "24",
# "liveBroadcastContent": "none",
# "defaultLanguage": "en",
# "localized": {...},
# "defaultAudioLanguage": "en"
# },
# "contentDetails": {
# "duration": "PT8M56S",
# "dimension": "2d",
# "definition": "hd",
# "caption": "false",
# "licensedContent": true,
# "contentRating": {},
# "projection": "rectangular"
# },
# "status": {
# "uploadStatus": "processed",
# "privacyStatus": "public",
# "license": "youtube",
# "embeddable": true,
# "publicStatsViewable": true,
# "madeForKids": false
# },
# "statistics": {
# "viewCount": "34275",
# "likeCount": "749",
# "favoriteCount": "0",
# "commentCount": "28"
# },
# "player": {...},
# "topicDetails": {...},
# "recordingDetails": {},
# "localizations": {...}
# },
# {
# "kind": "youtube#video",
# "etag": "hFSA6Fj9y8Hx-zAKzavsM9vR8Kk",
# "id": "knkv4Habj9s",
# "snippet": {
# "publishedAt": "2023-08-12T19:00:16Z",
# "channelId": "UCU0zC0L4o0qOos83NLhK5ug",
# "title": "Best of the WEEK! Funniest Cats And Dogs Videos \ud83d\ude02",
# "description": "Best of the WEEK! Funniest Cats And Dogs Videos \ud83d\ude02\nBeing an animal lover, these were cute!\n----------------------------------------------------------------------------\nDogs can talk, dogs can understand. Who knew?\ud83d\ude31 \nhttps://www.youtube.com/watch?v=D6aecAswvyE&list=PLDae3oPkWscWqZZtu8gqj20Ug9ijrr7Ae&index=1&pp=iAQB \n----------------------------------------------------------------------------\n\n#dogs #pets #animal #dog #animal #animal2023",
# "thumbnails": {...},
# "channelTitle": "Ginger Cat",
# "tags": [...],
# "categoryId": "24",
# "liveBroadcastContent": "none",
# "defaultLanguage": "en",
# "localized": {...},
# "defaultAudioLanguage": "en"
# },
# "contentDetails": {
# "duration": "PT10M19S",
# "dimension": "2d",
# "definition": "hd",
# "caption": "false",
# "licensedContent": true,
# "contentRating": {},
# "projection": "rectangular"
# },
# "status": {
# "uploadStatus": "processed",
# "privacyStatus": "public",
# "license": "youtube",
# "embeddable": true,
# "publicStatsViewable": true,
# "madeForKids": false
# },
# "statistics": {
# "viewCount": "37920",
# "likeCount": "588",
# "favoriteCount": "0",
# "commentCount": "25"
# },
# "player": {...},
# "topicDetails": {...},
# "recordingDetails": {},
# "localizations": {...}
# }
# ]
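In the response above, the counts in statistics come back as strings and duration is an ISO 8601 duration such as PT8M56S. A minimal sketch of turning one video resource into plain numbers is shown below. duration_to_seconds and summarise_video are hypothetical helpers, the regular expression only handles the day, hour, minute and second forms, and missing counts are assumed to default to 0.

```python
import re

# Matches durations such as PT8M56S, PT1H2M3S or P1DT2H (assumed forms only)
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(duration):
    """Convert an ISO 8601 duration string into a number of seconds."""
    match = _DURATION_RE.match(duration)
    if match is None:
        raise ValueError(f"Unsupported duration: {duration!r}")
    parts = {key: int(value) if value else 0 for key, value in match.groupdict().items()}
    return (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )


def summarise_video(video):
    """Pick a few fields from a video resource and convert them to numbers."""
    stats = video.get("statistics", {})
    return {
        "id": video["id"],
        "title": video["snippet"]["title"],
        "duration_seconds": duration_to_seconds(video["contentDetails"]["duration"]),
        "views": int(stats.get("viewCount", 0)),
        "likes": int(stats.get("likeCount", 0)),
        "comments": int(stats.get("commentCount", 0)),
    }


print(duration_to_seconds("PT8M56S"))  # 536
```

Applied to the list returned earlier, [summarise_video(v) for v in videos] gives rows that are ready to load into a table.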
Every time you request a resource from these methods, an etag is attached to the response. One of the main uses of this tag is to keep track of changes to the resource: if the resource has not changed since your last request, the etag stays the same.
One can pass the etag along with the request in the if-none-match header. If the resource's etag matches the one you provided, the server returns HTTP status 304, indicating that you can reuse the previous response. Otherwise, it returns the updated resource.
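channel_id = "UCSXwxpWZQ7XZ1WL3wqevChA"
# "part" is a required parameter of this endpoint; the value used here is an assumption
params = {"part": "snippet", "id": channel_id}
# First request: fetch the resource and read its etag
response = requests.get(
    url="https://www.googleapis.com/youtube/v3/channels",
    params=params,
    headers=headers
)
# Second request: send the etag back in the if-none-match header
headers["if-none-match"] = response.json()["etag"]
response = requests.get(
    url="https://www.googleapis.com/youtube/v3/channels",
    params=params,
    headers=headers
)
response.status_code
# 304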
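Putting this together, a small cache keyed by the request can decide when to reuse a stored response. The sketch below is one possible arrangement rather than an official client: get_with_etag and the cache layout are hypothetical, get is assumed to be requests.get or any callable with the same keyword arguments, and the etag is read from the JSON body as in the example above.

```python
def get_with_etag(url, params, headers, cache, get):
    """Fetch a resource, reusing the cached body when the server answers 304.

    cache: dict mapping a request key to {"etag": ..., "body": ...}
    get:   requests.get, or any callable accepting url, params and headers
    """
    key = (url, tuple(sorted(params.items())))
    cached = cache.get(key)
    # Copy the headers so the caller's dict is not modified
    request_headers = dict(headers)
    if cached is not None:
        request_headers["If-None-Match"] = cached["etag"]
    response = get(url=url, params=params, headers=request_headers)
    if cached is not None and response.status_code == 304:
        # Not modified: the stored body is still valid
        return cached["body"]
    response.raise_for_status()
    body = response.json()
    cache[key] = {"etag": body["etag"], "body": body}
    return body
```

With cache = {} and get=requests.get, calling the function twice with the same arguments sends the etag on the second call and, if the server replies 304, returns the stored body without downloading it again.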