News Item Search Analysis: Byline Search

Background

The Mason core website – and other Mason Drupal websites – provide Mason-related news items, which are searchable.

The news item search on the Mason core website can be found at: www2.gmu.edu/latest-news.

[Screenshot: the news item search form]

The news item search form allows the user to search for news items by byline (author), date range, tags, and topics. The news item search does not provide an option to search by keywords found in the news item titles and full-text content.

Anecdotally, we had heard reports that users were mistaking the byline search field for a keyword search field and were trying to enter a keyword in the byline field. In these cases, the search was not performing as users would expect.

The Question

Are users mistakenly using the byline field as a keyword search?

The Analysis

To answer this question, I pulled data from Google Analytics on the use of the news item search by off-campus users for the past two full months (August and September 2017).

Because I wanted to analyze news item search usage from off-campus users only (to filter out internal office activity), I used a custom segment in Google Analytics to focus on off-campus traffic. When custom segments are used in Google Analytics, the generated reports are more likely to rely on data sampling. Indeed, when I pulled this data for the full two-month date range using the off-campus custom segment, the report used sampled data.

Sometimes data sampling is not a problem, but in this case, because use of the news item search is a relatively small percentage of overall traffic, sampling could make a substantial difference in the results. To get around this, I pulled two reports of one month each and then combined the data. With the smaller date range of just one month, Google Analytics did not use sampled data.

I looked specifically for URLs representing use of the news item search feature in which a search term was entered in the byline field.

I then extracted the byline field search term from the results to see how often the search term was appropriate for the byline field (where the user was intending to search by byline), and how often it was not (where the user was actually attempting a keyword search).

Report Specifications

Account: Mason Office of Communications and Marketing 01
Property: www2.gmu.edu
View: [PROD] www2.gmu.edu – default 2.0 (2017-06-25)
Report: Behavior -> Site Content -> All Pages
Segment: Off-Campus
Date Range (Report 1): Aug 1, 2017 – Aug 31, 2017
Date Range (Report 2): Sep 1, 2017 – Sep 30, 2017

Now we need to filter for only URLs representing searches on news items. The news item search is located at /latest-news. If a search has been requested, the URL includes the user’s search parameters, which follow a question mark in the URL.

For example, here is a URL that represents a search on the byline field for the term “test”:

https://www2.gmu.edu/latest-news?field_byline_target_id_value=test&created%5Bmin%5D=&created%5Bmax%5D=&field_news_tags_tid=All&field_content_topics_target_id=&field_byline_target_id=&field_content_topics_target_id_value=

Note that the byline search term the user entered is found in the field_byline_target_id_value parameter.
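
As a minimal sketch of how that parameter can be read out of such a URL, the example URL above can be parsed with Python's standard library. This is only an illustration and not part of the original Google Analytics workflow; the variable names are my own.

```python
from urllib.parse import urlsplit, parse_qs

# The example search URL from above, split across lines for readability.
url = (
    "https://www2.gmu.edu/latest-news?field_byline_target_id_value=test"
    "&created%5Bmin%5D=&created%5Bmax%5D=&field_news_tags_tid=All"
    "&field_content_topics_target_id=&field_byline_target_id="
    "&field_content_topics_target_id_value="
)

# keep_blank_values=True keeps the empty parameters, so every field
# of the search form is visible in the result.
params = parse_qs(urlsplit(url).query, keep_blank_values=True)

# parse_qs returns a list of values for each parameter name.
byline_term = params["field_byline_target_id_value"][0]
print(byline_term)
```

Note that percent-encoded names are decoded during parsing, so `created%5Bmin%5D` appears in `params` as `created[min]`.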

Therefore we need to filter for URLs starting with the path “/latest-news” followed by a question mark, using the following regular expression:

Filter: Include -> Page -> Matching RegExp -> ^/latest-news\?
(The ‘latest news’ page, with a search.)
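
To check what this expression does and does not match, here is a small Python sketch. The sample page paths are made up for illustration, and I am assuming the Google Analytics filter behaves like an unanchored regular expression search, which is why the leading caret matters.

```python
import re

# Same expression as the filter above.
SEARCH_PAGE = re.compile(r"^/latest-news\?")

def is_news_search(page_path: str) -> bool:
    """Return True if the page path is the news listing with a query string."""
    return SEARCH_PAGE.search(page_path) is not None

# Hypothetical page paths, for illustration only.
sample_pages = [
    "/latest-news",                                    # listing page, no search
    "/latest-news?field_byline_target_id_value=test",  # a search
    "/about/latest-news?x=1",                          # different path
]

matches = [page for page in sample_pages if is_news_search(page)]
print(matches)
```

Only the second path is kept: the first has no question mark, and the third does not start with /latest-news.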

However, we are only interested in searches that included a term in the byline field. We can use the following regular expression to further filter for URLs in which the byline field value parameter is not immediately followed by an ampersand (which separates it from the next parameter); in other words, a byline search term has been provided:

Filter: Include -> Page -> Matching RegExp -> field_byline_target_id_value=[^&]
(A URL in which ‘field_byline_target_id_value=’ is NOT immediately followed by an ampersand.)
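
The combined effect of the two filters can be sketched in Python as follows. The page paths are hypothetical, and applying the two expressions one after the other is my assumption about how stacked Include filters behave.

```python
import re

SEARCH_PAGE = re.compile(r"^/latest-news\?")
HAS_BYLINE_TERM = re.compile(r"field_byline_target_id_value=[^&]")

def is_byline_search(page_path: str) -> bool:
    """True only if both filter expressions match the page path."""
    return bool(SEARCH_PAGE.search(page_path)) and bool(
        HAS_BYLINE_TERM.search(page_path)
    )

# Hypothetical page paths, for illustration only.
pages = [
    "/latest-news?field_byline_target_id_value=test&field_news_tags_tid=All",
    "/latest-news?field_byline_target_id_value=&field_news_tags_tid=All",
    "/latest-news?field_news_tags_tid=All",
    "/latest-news?field_news_tags_tid=All&field_byline_target_id_value=",
]

results = [is_byline_search(page) for page in pages]
print(results)
```

Only the first path passes. The last one is worth noting: an empty byline parameter at the very end of the URL is also excluded, because `[^&]` requires at least one character after the equals sign.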

In summary, this provides a list of all the URLs representing news item searches by off-campus users in which a term was specified in the byline field.

I then combined the two months of data and extracted the byline search terms from the URLs.
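
This step could be scripted along the following lines. It is a rough sketch rather than the process actually used: the rows and pageview counts are invented for illustration, the function names are hypothetical, and lowercasing the terms is an assumption about how duplicates should be merged.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

PARAM = "field_byline_target_id_value"

def byline_term(page_path):
    """Return the decoded byline search term, or None if it is absent or empty."""
    values = parse_qs(urlsplit(page_path).query).get(PARAM, [])
    # Assumption: trim and lowercase so that "DACA" and "daca" are merged.
    return values[0].strip().lower() if values else None

def combine(*monthly_reports):
    """Sum pageviews per byline term across any number of monthly reports."""
    totals = Counter()
    for report in monthly_reports:
        for page_path, pageviews in report:
            term = byline_term(page_path)
            if term:
                totals[term] += pageviews
    return totals

# Invented (page, pageviews) rows standing in for the two monthly exports.
august = [
    ("/latest-news?field_byline_target_id_value=total+eclipse&field_news_tags_tid=All", 3),
    ("/latest-news?field_byline_target_id_value=daca", 1),
]
september = [
    ("/latest-news?field_byline_target_id_value=DACA&field_news_tags_tid=All", 2),
    ("/latest-news?field_byline_target_id_value=%C3%A1ngel+cabrera", 1),
]

totals = combine(august, september)
for term in sorted(totals):
    print(term, totals[term])
```

Parsing the query string, rather than cutting the value out with a regular expression, also decodes plus signs and percent-encoded characters, so accented terms come out readable.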

Results

Here are the search terms used in the byline field by off-campus users during the months of August and September 2017:

Search Term | Relevant to Byline | Notes
agbiboa | No | Mason professor. Not a news writer.
angel cabrera | No |
ángel cabrera | No |
award | No |
binge watchers | No |
cabrera | No |
car brain activity | No |
car brain activity simulator | No |
card | No |
cirque | No |
daca | No |
devos title ix | No |
dhs | No |
drive brain activity simulator | No |
emotional support dogs | No |
george mason inn | No |
healthcare benefits | No |
hiv | No |
inova | No |
journal of gerontology | No |
kearney | Maybe | Perhaps refers to Colleen Kearney Rich, who does write news items, but could also refer to this: http://business.gmu.edu/kearney/
loudoun | No |
medieval disease | No |
mercatus | No |
michael buschmann | No | A Mason professor, but not an author of our news items.
mobility | No |
move in | No |
over50 | No |
over50 tv | No |
peterson | No | A person’s name, but not an author of our news items.
scout | No |
sid dewberry | No |
sid dewberry music | No |
sklarew | No | Mason professor. Not a news writer.
statement from president cabrera | No |
states | No |
stearns center for teaching and learning | No |
total eclipse | No |
total solar eclipse | No |
tv | No |
water | No |
welcome2mason | No |
wiley | No | Perhaps refers to Wiley project, although we also have a couple of professors by this name.
wilkin | No | Perhaps refers to Roger Wilkins, for whom the north plaza was renamed.
zimmerman employee of the month | No |

Looking at the search terms above, it is clear that most people are attempting to use the byline field as a keyword search. Of the 45 search terms listed, 44 are not relevant to a byline and only one (“kearney”) is a “maybe”. Very few, if any, off-campus users are using the byline field for its intended purpose.

Recommendations

  • If possible, we should offer a keyword search for news items. People appear to expect this functionality.
  • Even though the field is labeled “Byline”, it seems that people aren’t getting the message. Perhaps labeling the field “Author” would make it clearer to users.
  • Moving the byline search field further down the form from its position as the first search field may make users less likely to mistake it for a keyword search field.