Earth and Planetary Sciences, University of California, Davis

Author: meoskin

Linear Regression in R

I have been an R user for a long time (more than 20 years), and I have long been confused by the behavior of the linear model objects generated by lm. The standard R help file is not very helpful, so this week I put together some example code that I hope explains things better. I am posting it here in case others find it useful:

# Part 1: Generate random data containing a linear relationship with noise
x = 1:100 + rnorm(100,sd=10) # x positions, scattered
m = 2 # input slope
b = 30 # input intercept
y = m*x + b + rnorm(100,sd=30) # y positions, with added scatter

# plot the data points we just generated
plot(x,y)

# regress y against x by generating a linear model.
# Note that the variable names and order matter here: the response (y)
# goes to the left of ~, and predict() later looks up the predictor by name (x).
fit = lm(y~x) 
# Let's see how well the fitted coefficients match the inputs
print(paste("slope: ",fit$coefficients[2]))
print(paste("intercept: ",fit$coefficients[1]))

# generate a new set of x values at which to predict
newx = -20:120 
# predict y values. 
# Note that we are assigning newx to a column named x inside a data frame.
# This is important! predict() matches new data to the model by variable
# name, so passing the bare vector 'newx' does not work.
newy = predict(fit,data.frame(x=newx))

# add best-fit line to plot.
lines(newx,newy) 

# Let's do this again with a confidence interval for the best-fit line.
# When an interval is requested, predict returns a matrix with the
# named columns fit, lwr, and upr.
# I wrap this in data.frame() to coerce the output to a data frame.
newy = data.frame(predict(fit,data.frame(x=newx),interval="confidence"))

# Note that the default level of confidence is 95%.
# For 99% confidence, as an example, add level=0.99 to the predict input.

# plot all three lines (confidence as dashed, all in blue)
lines(newx,newy$fit,col=4)
lines(newx,newy$lwr,col=4,lty=2)
lines(newx,newy$upr,col=4,lty=2)

# We can also plot the prediction interval
# (the range expected to contain a new data point at each x, so it is
# wider than the confidence interval for the line itself):
newy = data.frame(predict(fit,data.frame(x=newx),interval="prediction"))

# Plot these as dotted lines (lty=3)
lines(newx,newy$lwr,col=4,lty=3)
lines(newx,newy$upr,col=4,lty=3)
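
The data frame naming issue mentioned in the comments above is easy to check directly. The following is a minimal sketch, not part of the original script: it refits the same kind of model and compares three ways of calling predict. The set.seed() call and the names good, bad, and res are my additions for illustration only.

```r
# Self-contained sketch: refit the same kind of model, then compare
# three ways of calling predict(). set.seed() is added here only so
# the example is reproducible; it is not part of the original script.
set.seed(1)
x <- 1:100 + rnorm(100, sd = 10)
y <- 2 * x + 30 + rnorm(100, sd = 30)
fit <- lm(y ~ x)

newx <- -20:120  # 141 new x values

# 1. Correct: the data frame column is named 'x', matching the formula.
good <- predict(fit, data.frame(x = newx))
print(length(good))  # one prediction per new x value

# 2. Wrong column name: predict() cannot find 'x' in the new data, so it
#    picks up the original x from the workspace instead and issues a
#    warning (suppressed here). The result has 100 values, not 141.
bad <- suppressWarnings(predict(fit, data.frame(newx = newx)))
print(length(bad))

# 3. Bare vector: predict() needs a data frame (or list), so this call
#    fails. try() captures the error so the script keeps running.
res <- try(predict(fit, newx), silent = TRUE)
print(inherits(res, "try-error"))
```

The second case is the one to watch for, because it returns numbers without stopping: the predictions are simply at the original x positions rather than at newx.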

Flooding surface of the Gulf of California

Here I am pointing at an abrupt transition from bouldery alluvial-fan conglomerate to marine sandstone near the base of the Fish Creek-Vallecito basin. This marks the flooding of marine water into the Salton Trough about 6.5 million years ago, into what was probably a closed, sub-sea-level basin at that time. I think it would have been amazing to witness this event.

Stuck in the mud

This happened on the last day of fieldwork in the eastern Sierra Nevada in May, after a very wet winter. This road looked passable, but the thin dry crust hid a thick, sticky layer of mud. My poor field truck was doomed. I had to pay for professional help to get pulled out of this mess.

Lassen Field Station

I am the faculty director for the Lassen Field Station, one of the newest additions to the University of California Natural Reserve System. Above is the view of Battle Creek Meadow from the staff campground at the Park Headquarters, where our field station is currently based. This is where I am now teaching our summer field course (GEL 110A).

Fragment of the Sierra

The photograph above was taken at Bodega Head in Sonoma County. Dramatic cliffs of granite rise up out of the Pacific Ocean here. This is the northernmost exposed outcrop of granitic rocks west of the San Andreas Fault. Slip on the fault juxtaposes rocks sliced off the Sierra Nevada batholith against the Franciscan subduction complex. But what about the rocks that were between the Sierra Nevada and the Franciscan complex? Shouldn’t those have been translated north as well? Instead, the granitic rocks meet the sea here, and that forearc basin is missing! This is a clue to events that took place in Southern California during subduction, before the San Andreas Fault formed.

What’s a blog for these days?

Since this is a WordPress site, it comes with a blog function. It’s so 1999! So I put a field photo here from Isla Tiburon that I took in 1999. I know this will get stale because that’s what blogs do, so my plan is to keep it light and post interesting pictures and stories from my teaching and research experiences. The purpose is to give visitors (especially prospective students) a better idea of who I am and what it’s like to work with me. Enjoy!
