At the end of this section, you should feel confident enough to create and perform your own nonparametric hypothesis test.
In this section, you will find two examples of hypothesis testing using the nonparametric test statistics we discussed previously. Again, I suggest that you work through these problems as you read along. The examples use the temperature datasets for London and Scotland. Each example builds on the previous one, and the complexity of the problem increases as we go. For each example, I will again pose a question and then work through the procedure discussed previously, step by step, so that you can follow along.
Ads are not only displayed in the newspaper anymore; they are seen online, through apps on your phone or tablet, and can be constantly updated or changed. Although new platforms for advertising may seem beneficial, the timing of an ad is everything. Check out this marketing strategy by Pantene:
WOMAN #1: Rain and humidity are pretty much the two worst things.
WOMAN #2: I end up looking like a wet cat.
WOMAN #3: My confidence was just totally ruined because of my hair.
WOMAN #4: If I go outside and it's really hot out, or humid, it can tend to frizz up really easily.
WOMAN #5: My halo of frizz goes up to here.
WOMAN #6: You feel like less than yourself. You're like I tried, but it looks like I didn't make an effort at all.
[ON SCREEN TEXT: Weather is the enemy of beautiful hair.]
WOMAN #7: My weather app is on the front of my phone because I use it every single day.
NARRATOR: We learned that women check the forecast the first thing in the morning to see what the weather will do to their hair. So Pantene brought in the world's most popular weather app to deliver a geo-targeted Pantene hair solution to match the weather. When it was humid in Houston we recommended the Pantene Smooth and Sleek Collection for frizzy hair. When it was dry in Denver we recommended the Moisture Renewal Collection for dry hair. Pantene had a product solution for every kind of weather across the country.
The forecast proved right. Two times more media engagement. 4% sales growth. A 24% sales increase. And 100% chance of self-confidence, optimism, and sunny dispositions.
WOMAN #8: When I'm having a good hair day I feel absolutely amazing. Like I have like a different walk when I have a good hair day. I just like stomp down the street. I feel like Beyonce.
WOMAN #5: I unabashedly take a selfie.
WOMAN #6: And there's something really wonderful about that.
Depending on the weather conditions, an ad appears that promotes a specific type of shampoo or conditioner to counteract those conditions. The ad is time and location dependent: it is displayed only at the most advantageous time for a particular location.
From a marketing perspective, it would be advantageous to display ads for BBQ when people are most interested in BBQ. Since BBQ sales triple when the temperature reaches a particular threshold, we can use that threshold as a guideline for choosing the optimal advertising time. We should also take advantage of the time when people are searching for BBQ online, because online advertising is a great way to reach thousands of people. So, how do inquiries about BBQ relate to the BBQ temperature threshold? We can determine which week in a given year London and Scotland are most likely to observe temperatures above the BBQ temperature thresholds. But do the inquiries about BBQ peak during the same week?
To answer this question, I'm going to extract every date on which the temperature reached or exceeded the BBQ temperature threshold for London and Scotland. Instead of working with day/month/year, I will work with 'weeks'. That is, in which week of the year (1-52) can we expect the temperature to reach the BBQ threshold? Use the code below to extract the dates and transform them into weeks:
Your script should look something like this:
# extract out weeks in which the temperature exceeds the threshold
ScotlandBBQWeeks <- format(ScotlandDate[which(ScotlandTemp >= 20)],"%W")
LondonBBQWeeks <- format(LondonDate[which(LondonTemp >= 24)],"%W")
The variables ScotlandBBQWeeks and LondonBBQWeeks hold, as week numbers (1-52), every instance from 1980-2015 in which the temperature reached or exceeded the required BBQ threshold. The weeks range from week 13 to week 42.
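To see what the "%W" conversion produces, here is a minimal sketch on a handful of made-up dates and temperatures. The vectors toyDate and toyTemp are hypothetical stand-ins for the course data, not values from the London or Scotland records. Note that "%W" returns the week as a two-character string and counts weeks from the first Monday of the year, so any days before that Monday fall in week "00".

```r
# Hypothetical toy data standing in for LondonDate / LondonTemp
toyDate <- as.Date(c("2015-01-01", "2015-04-15", "2015-06-30",
                     "2015-07-01", "2015-10-20"))
toyTemp <- c(5, 24, 27, NA, 18)

# Keep only the days at or above the London BBQ threshold (24),
# then convert those dates to week-of-year strings.
# which() skips the NA temperature instead of returning an NA index.
toyBBQWeeks <- format(toyDate[which(toyTemp >= 24)], "%W")

# "%W" yields character strings, so convert before doing arithmetic
toyBBQWeeksNum <- as.numeric(toyBBQWeeks)

print(toyBBQWeeks)            # weeks as two-digit strings
print(range(toyBBQWeeksNum))  # earliest and latest BBQ week
print(table(toyBBQWeeksNum))  # how often each week appears
```

The same range() and table() calls can be applied to as.numeric(LondonBBQWeeks) and as.numeric(ScotlandBBQWeeks) to check the week range quoted above.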
A natural follow-on question would be: does the peak inquiry week coincide with the first week in which we observe temperatures above the BBQ temperature threshold? This would be an optimal time to advertise for BBQ products online.
To answer this question, I'm going to extract, for each year, the first week in which the temperature reaches or exceeds the BBQ threshold for London and Scotland. Use the code below to extract the dates and transform them into weeks:
Your script should look something like this:
# for each year choose the first week in which the temperature exceeds the BBQ threshold
FirstLondonBBQWeek <- vector()
FirstScotlandBBQWeek <- vector()
numYears <- length(unique(format(LondonDate,"%Y")))
LondonYears <- unique(format(LondonDate,"%Y"))
ScotlandYears <- unique(format(ScotlandDate,"%Y"))
for(iyear in seq_len(numYears)){
  # London
  year_index <- which(format(LondonDate,"%Y") == LondonYears[iyear])
  tempDate <- LondonDate[year_index]
  # which() drops NA comparisons, so the indices stay aligned with tempDate
  weeks <- format(tempDate[which(LondonTemp[year_index] >= 24)],"%W")
  # weeks[1] is NA when no day in that year reaches the threshold
  FirstLondonBBQWeek <- c(FirstLondonBBQWeek,weeks[1])
  # Scotland
  year_index <- which(format(ScotlandDate,"%Y") == ScotlandYears[iyear])
  tempDate <- ScotlandDate[year_index]
  weeks <- format(tempDate[which(ScotlandTemp[year_index] >= 20)],"%W")
  FirstScotlandBBQWeek <- c(FirstScotlandBBQWeek,weeks[1])
}
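If you prefer to avoid the explicit loop, the same "first week per year" idea can be sketched with tapply(). This is an alternative under stated assumptions, not the method used in the lesson: the helper firstBBQWeek and the toy vectors are hypothetical, and unlike the loop above, years with no qualifying day are simply left out of the result rather than recorded as NA.

```r
# Hypothetical toy data: two years of daily dates with made-up temperatures
toyDate <- seq(as.Date("2014-01-01"), as.Date("2015-12-31"), by = "day")
toyTemp <- rep(10, length(toyDate))
toyTemp[toyDate == as.Date("2014-05-19")] <- 25
toyTemp[toyDate >= as.Date("2014-07-01") & toyDate <= as.Date("2014-07-10")] <- 26
toyTemp[toyDate == as.Date("2015-06-30")] <- 27
toyTemp[toyDate == as.Date("2015-03-01")] <- NA  # a missing reading

# Hypothetical helper: earliest week per year at or above the threshold
firstBBQWeek <- function(dates, temps, threshold) {
  hot <- which(temps >= threshold)               # which() skips NA temperatures
  hotYears <- format(dates[hot], "%Y")
  hotWeeks <- as.numeric(format(dates[hot], "%W"))
  # Within one year the week number never decreases with the date,
  # so the minimum week is the first qualifying week
  tapply(hotWeeks, hotYears, min)
}

toyFirstWeeks <- firstBBQWeek(toyDate, toyTemp, threshold = 24)
print(toyFirstWeeks)
```

The result is a named vector with one entry per year, already numeric, so it can be passed to a test without as.numeric().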
Your script should look something like this:
# compute the Sign Test (SIGN.test is provided by the BSDA package)
library(BSDA)
LondonSignScore <- SIGN.test(as.numeric(FirstLondonBBQWeek),md=22,alternative="two.sided",conf.level=0.95)
ScotlandSignScore <- SIGN.test(as.numeric(FirstScotlandBBQWeek),md=22,alternative="two.sided",conf.level=0.95)
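To build some intuition for what the sign test is doing, here is a rough base R sketch of the same idea. Under the null hypothesis that the median first BBQ week is 22, each year is equally likely to fall above or below week 22, so the number of years above it can be checked with an exact binomial test. The vector toyWeeks holds made-up values, not the course data, and I am not claiming this reproduces every detail of the SIGN.test output.

```r
# Hypothetical first-BBQ weeks for ten years (made-up values, not course data)
toyWeeks <- c(18, 20, 21, 22, 23, 19, 17, 24, 20, 21)
md <- 22  # hypothesised median week, matching the md argument used above

# Drop missing years and ties with the hypothesised median, then count signs
diffs <- toyWeeks[!is.na(toyWeeks)] - md
diffs <- diffs[diffs != 0]
nAbove <- sum(diffs > 0)   # years whose first BBQ week is later than week 22
nTotal <- length(diffs)    # years that are informative for the test

# Exact two-sided binomial test of P(above the median) = 0.5
signSketch <- binom.test(nAbove, nTotal, p = 0.5, alternative = "two.sided")
print(signSketch$p.value)
```

A small p-value would suggest that the median first BBQ week differs from week 22. With only a handful of informative years, as in this toy example, the test has little power, so a large p-value should not be read as evidence that the median is exactly 22.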