This example pulls the top 10 pages for the last thirty days, for visits that occurred on a mobile device. We’ll do this by referencing the standard Mobile Segment in Google Analytics, but the process is exactly the same for referencing a custom segment that you built; the comments in the code explain how. This returns the exact same results as two other examples, which define or reference the segment through different means.
All three approaches are perfectly acceptable. This example, though, relies on the segment being available through the Google Analytics web interface – as a standard segment or a custom segment.
Be sure you’ve completed the steps on the Initial Setup page before running this code.
For the setup, we’re going to load a few libraries, load our specific Google Analytics credentials, and then authorize with Google.
# Load the necessary libraries. These libraries aren't all necessarily required for every
# example, but, for simplicity's sake, we're going ahead and including them in every example.
# The "typical" way to load these is simply with "library([package name])." But, the handy
# thing about using the approach below -- which uses the pacman package -- is that it will
# check that each package exists and actually install any that are missing before loading
# the package.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(googleAnalyticsR, # How we actually get the Google Analytics data
tidyverse, # Includes dplyr, ggplot2, and others; very key!
devtools, # Generally handy
googleVis, # Useful for some of the visualizations
scales) # Useful for some number formatting in the visualizations
# Authorize GA. Depending on if you've done this already and a .ga-httr-oauth file has
# been saved or not, this may pop you over to a browser to authenticate.
ga_auth(token = ".ga-httr-oauth")
# Set the view ID and the date range. If you want to, you can swap out the Sys.getenv()
# call and just replace that with a hardcoded value for the view ID. And, the start
# and end date are currently set to choose the last 30 days, but those can be
# hardcoded as well.
view_id <- Sys.getenv("GA_VIEW_ID")
start_date <- Sys.Date() - 31 # 30 days back from yesterday (31 days in total, counting both end dates)
end_date <- Sys.Date() - 1 # Yesterday
If that all runs with just some messages but no errors, then you’re set for the next chunk of code: pulling the data.
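Before moving on, it can help to confirm that the view ID was actually picked up and that the date range covers the number of days you expect. The following is a minimal base R sketch, not part of the original example: `days_in_range()` is a hypothetical helper (it is not a googleAnalyticsR function), and the check on `GA_VIEW_ID` assumes you are relying on the environment variable rather than a hardcoded value.

```r
# Hypothetical helper (not part of googleAnalyticsR): count the days in a date
# range, inclusive of both the start date and the end date.
days_in_range <- function(start, end) {
  as.integer(as.Date(end) - as.Date(start)) + 1L
}

# Sys.getenv() returns "" when a variable is not set, so check for that
# rather than letting an empty view ID reach the API call.
view_id <- Sys.getenv("GA_VIEW_ID")
if (!nzchar(view_id)) {
  message("GA_VIEW_ID is not set; hardcode a view ID or add it to .Renviron.")
}

start_date <- Sys.Date() - 31
end_date <- Sys.Date() - 1

# Both end dates are included in the query, so this window is 31 days long.
days_in_range(start_date, end_date)

# For a window of exactly 30 days ending yesterday, start 30 days before today.
days_in_range(Sys.Date() - 30, Sys.Date() - 1)
```

Whether you want 30 or 31 days is a judgment call; the point is simply to know which one you are asking for.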
The key step here is figuring out the ID of the segment that you want to use. One option – once you’ve loaded googleAnalyticsR and tidyverse and run ga_auth() – is to create a data frame with all your segments in it. Simply dropping this command in the console will create a segs data frame that will list all of them. That can be a mighty long list:
segs <- as.data.frame(ga_segment_list()) %>% select(items.segmentId, items.name, items.type)
Alternatively, you can just hop over to the Google Analytics Query Explorer, choose a segment in the segment field, and then copy the value (you have to have the Show segment definitions instead of IDs checkbox unchecked to get the ID).
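If you go the data frame route, you probably don't want to scroll through the whole list. Here is a rough base R sketch of searching it by name. To keep it runnable without authenticating, it builds a small stand-in for `segs`: the column names are taken from the `select()` call above, the "Mobile Traffic" ID is the one used in this example, and the other rows (including the custom segment ID) are placeholders made up for illustration.

```r
# Stand-in for the real segs data frame so this runs without authenticating.
# Column names match the select() call above; every row other than
# "Mobile Traffic" is a made-up placeholder.
segs <- data.frame(
  items.segmentId = c("gaid::-1", "gaid::-14", "gaid::EXAMPLEcustomID"),
  items.name      = c("All Users", "Mobile Traffic", "My Custom Segment"),
  items.type      = c("BUILT_IN", "BUILT_IN", "CUSTOM"),
  stringsAsFactors = FALSE
)

# Case-insensitive search on the segment name.
matches <- segs[grepl("mobile", segs$items.name, ignore.case = TRUE), ]
matches

# Pull out the ID to pass to segment_ga4(segment_id = ...).
mobile_segment_id <- matches$items.segmentId[1]
mobile_segment_id
```

With your real `segs` data frame, you would skip the `data.frame()` step and run only the search.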
# Create the segment object. See ?segment_ga4() for details. Note that the name -- the first
# argument -- is actually moot here. When you use the segment_id argument, what will actually
# get used as the name of the segment is, well, the name of the segment in the web interface.
# To illustrate, even though I've put "Mobile Sessions Only" in the name argument for this
# function, the output below actually uses the name "Mobile Traffic."
my_segment <- segment_ga4("Mobile Sessions Only",
segment_id = "gaid::-14")
# Pull the data. See ?google_analytics() for additional parameters. Depending on what
# you're expecting back, you probably would want to use an "order" argument to get the
# results in descending order. But, we're keeping this example simple. Note that the
# segment object we created above gets passed directly to the segments argument.
ga_data <- google_analytics(viewId = view_id,
date_range = c(start_date, end_date),
metrics = "pageviews",
dimensions = "pagePath",
segments = my_segment)
# Go ahead and do a quick inspection of the data that was returned. This isn't required,
# but it's a good check along the way.
head(ga_data)
pagePath | segment | pageviews |
---|---|---|
/ | Mobile Traffic | 269 |
/?__hstc=205162639.2492ee4e2514a59ed226f9dc5224e8b6.1537248787762.1537248787762.1537248787762.1&__hssc=205162639.1.1537248787763&__hsfp=2964561211&hsCtaTracking=8bc9e3c4-0d81-453f-9aa0-e7ee2d150e86\|4361a058-c61b-4634-98be-95a3d7d9cb0c | Mobile Traffic | 1 |
/?__hstc=205162639.3ee49b04f9a3ca95eff26980d8739ac8.1537165640249.1537165640249.1537165640249.1&__hssc=&hsCtaTracking=8bc9e3c4-0d81-453f-9aa0-e7ee2d150e86\|4361a058-c61b-4634-98be-95a3d7d9cb0c | Mobile Traffic | 1 |
/about/ | Mobile Traffic | 43 |
/about/career-spotlights/ | Mobile Traffic | 1 |
/about/careers/ | Mobile Traffic | 83 |
Since we didn’t sort the data when we queried it, let’s go ahead and sort it here and grab just the top 10 pages.
# Using dplyr, sort descending and then grab the top 10 values. We also need to make the
# page column a factor so that the order will be what we want when we chart the data.
# This is a nuisance, but you get used to it. That's what the mutate function is doing.
ga_data_top_10 <- ga_data %>%
arrange(-pageviews) %>%
top_n(10) %>%
mutate(pagePath = factor(pagePath,
levels = rev(pagePath)))
# Take a quick look at the result.
head(ga_data_top_10)
pagePath | segment | pageviews |
---|---|---|
/ | Mobile Traffic | 269 |
/open-positions/ | Mobile Traffic | 124 |
/about/careers/ | Mobile Traffic | 83 |
/solutions/industries/ | Mobile Traffic | 49 |
/solutions/partners/adobe/adobe-launch/dtm-launch-assessment/ | Mobile Traffic | 48 |
/about/ | Mobile Traffic | 43 |
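To see what the factor step is actually doing, here is a small base R sketch of the same idea. It is a simplified stand-in rather than the code above: it uses a handful of rows copied from the tables in this example, keeps only the top 3 to stay short, and uses `order()` and `head()` in place of the dplyr functions.

```r
# A few rows from the tables above, in the unsorted order the query returned.
pages <- data.frame(
  pagePath  = c("/", "/about/", "/about/career-spotlights/", "/about/careers/",
                "/open-positions/", "/solutions/industries/"),
  pageviews = c(269, 43, 1, 83, 124, 49),
  stringsAsFactors = FALSE
)

# Sort descending by pageviews and keep the first 3 rows.
top_pages <- head(pages[order(-pages$pageviews), ], 3)

# Without explicit levels, factor() sorts the levels alphabetically, and that
# is the order ggplot2 would use along the axis.
levels(factor(top_pages$pagePath))

# With levels = rev(pagePath), the page with the most pageviews becomes the
# LAST level. After coord_flip(), the last level is drawn at the top.
top_pages$pagePath <- factor(top_pages$pagePath,
                             levels = rev(top_pages$pagePath))
levels(top_pages$pagePath)
```

The row order of the data frame does not change here; only the level order does, and the level order is what the chart follows.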
This won’t be the prettiest bar chart, but let’s make a horizontal bar chart with the data. Remember, in ggplot2, a horizontal bar chart is just a normal bar chart with coord_flip().
# Create the plot. Note the stat = "identity" (because the data is already aggregated) and
# the coord_flip(). And, I just can't stand it... added on the additional theme stuff to
# clean up the plot a bit more.
gg <- ggplot(ga_data_top_10, mapping = aes(x = pagePath, y = pageviews)) +
geom_bar(stat = "identity") +
coord_flip() +
theme_light() +
theme(panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.border = element_blank(),
axis.title.y = element_blank(),
axis.ticks.y = element_blank())
# Output the plot. You *could* just remove the "gg <-" in the code above, but it's
# generally a best practice to create a plot object and then output it, rather than
# outputting it on the fly.
gg
This site is a sub-site to dartistics.com