Lab 1F
Directions: Follow along with the slides, completing
the questions in blue on your
computer, and answering the questions in red in your
journal.
caseid: Anonymous ID of survey taker.
V1: The age of the respondent.
V2: The sex of the respondent.
V3: Whether the person is employed full-time or part-time.
V4: Whether the person has a physical difficulty.
V5: How long the person sleeps, in minutes.
V6: How long the survey taker spent on homework, in minutes.
V7: How long the respondent spent socializing, in minutes.
Use the rename function to give the variables in atu_dirty more descriptive names.
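As a rough illustration of renaming, the sketch below builds a tiny made-up stand-in for atu_dirty and renames two of its columns. It assumes the rename function is the one from the dplyr package, which takes new_name = old_name pairs; if dplyr is not installed, it falls back to base R. The toy values and the new name age are assumptions chosen for illustration.

```r
# Toy stand-in for atu_dirty; the values are made up for illustration only.
atu_dirty <- data.frame(
  caseid = c("a1", "a2"),
  V1 = c("34", "51"),
  V2 = c("01", "02"),
  stringsAsFactors = FALSE
)

if (requireNamespace("dplyr", quietly = TRUE)) {
  # dplyr::rename() takes new_name = old_name pairs.
  atu_cleaner <- dplyr::rename(atu_dirty, age = V1, sex = V2)
} else {
  # Base R fallback: overwrite the matching column names directly.
  atu_cleaner <- atu_dirty
  names(atu_cleaner)[names(atu_cleaner) == "V1"] <- "age"
  names(atu_cleaner)[names(atu_cleaner) == "V2"] <- "sex"
}

names(atu_cleaner)
```

The renamed copy is stored under a new name, so atu_dirty itself is left unchanged.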
Sometimes R will treat values that look like numbers as if they were strings.
Data files also often code Yes/No variables as "1"/"0".
Look at the structure of your data and the variable descriptions from a few slides back.
We need to tell R to think of our “numeric” variables as numeric variables.
We can do this with the as.numeric function.
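A minimal sketch of such a conversion, assuming a single made-up value (the names pi_string and pi_number are hypothetical):

```r
# A number stored as a character string, as can happen when data is read in.
pi_string <- "3.14"
class(pi_string)   # "character"

# as.numeric() converts the string into a number R can do arithmetic with.
pi_number <- as.numeric(pi_string)
class(pi_number)   # "numeric"

pi_number
```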
## [1] 3.14
The value started out as the string "3.14", but as.numeric was able to turn it back into a number.
The sex variable uses "01" and "02" for "Male" and "Female", respectively.
It would be easier to read if the values were "Male" and "Female".
R has a special name for categorical variables, called factors.
R also has a special name for the different categories of a categorical variable. They are called levels.
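To make the two terms concrete, here is a small sketch using made-up codes in place of the real sex column (sex_codes and sex_factor are hypothetical names):

```r
# Toy values standing in for the sex column; made up for illustration.
sex_codes <- c("01", "02", "02", "01", "02")

# factor() tells R to treat the values as categories rather than plain text.
sex_factor <- factor(sex_codes)

is.factor(sex_factor)  # TRUE
levels(sex_factor)     # "01" "02"
```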
To see the levels of sex and their counts, type:
If we know that ‘01’ means ‘Male’ and ‘02’ means ‘Female’ then we can use the following code to recode the levels of sex.
atu_cleaner…sex variable’s levels …"01" will now be "Male"…"02" will now be "Female".
Recode the categorical variable about whether the person surveyed had a physical challenge or not. The coding is currently:
"01": Person surveyed did not have a physical challenge.
"02": Person surveyed did have a physical challenge.
Write a script that:
atu_dirty dataset
NOTE: You can watch this video to learn about RScripts:
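The counting and recoding steps described above for sex can be sketched as follows. The lab's own code is not reproduced here; this version swaps in base R's table() for the counts and factor() with labels for the recoding, and runs on a made-up stand-in for atu_cleaner.

```r
# Toy stand-in for atu_cleaner; the values are made up for illustration only.
atu_cleaner <- data.frame(
  caseid = c("a1", "a2", "a3", "a4"),
  sex = c("01", "02", "02", "01"),
  stringsAsFactors = FALSE
)

# Counts per level before recoding.
table(atu_cleaner$sex)

# Replace atu_cleaner's sex column with a recoded version:
# "01" will now be "Male" and "02" will now be "Female".
atu_cleaner$sex <- factor(atu_cleaner$sex,
                          levels = c("01", "02"),
                          labels = c("Male", "Female"))

# Counts per level after recoding.
table(atu_cleaner$sex)
```

The same pattern should carry over to any other two-level variable whose codes are known, with the levels and labels arguments changed to match.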
The last few lines of your script are extremely important because they will save all of your work.
Be sure to View your data and check its
structure to make sure it looks clean and tidy before
saving.
Run the code below:
This code will create a new data frame in your
Environment called atu_clean which is a final copy
of atu_cleaner.
If atu_clean is swept from your Environment, all of the changes you made will NOT be saved.
To permanently save your changes you need to save the file as an R data file or .Rda.
Run the code below:
This creates the atu_clean.Rda file.
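The copy-then-save steps might look like the sketch below. It uses base R's save() and load(), runs on a made-up stand-in for atu_cleaner, and writes to a temporary folder (an assumption made so the example leaves no files behind; in the lab you would more likely save to your working directory).

```r
# Toy stand-in for the cleaned data; the values are made up for illustration.
atu_cleaner <- data.frame(
  caseid = c("a1", "a2"),
  age = c(34, 51)
)

# Make the final copy.
atu_clean <- atu_cleaner

# Save the copy as an R data file. Using file = "atu_clean.Rda" instead
# would write it to the current working directory.
rda_path <- file.path(tempdir(), "atu_clean.Rda")
save(atu_clean, file = rda_path)

# Simulate the Environment being swept, then restore the data from the file.
rm(atu_clean)
load(rda_path)
```

load() restores the object under the name it was saved with, so atu_clean reappears in the Environment.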
Now that you have learned some cleaning data basics, it’s time to
revisit the food data.
Import your food data onto the Environment pane.
Run the code below:
Use the as.factor() function to
convert healthy_level into a categorical variable and
re-run the histogram function.
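A minimal sketch of the conversion, assuming a made-up food data frame in which healthy_level holds whole numbers from 1 to 5 (the calories column and all values are hypothetical). The lab's histogram call is not reproduced here.

```r
# Toy stand-in for the food data; columns and values are made up.
food <- data.frame(
  calories = c(150, 90, 250, 40, 120),
  healthy_level = c(2, 4, 1, 5, 3)
)

class(food$healthy_level)   # "numeric"

# as.factor() turns the numbers into categories.
food$healthy_level <- as.factor(food$healthy_level)

class(food$healthy_level)   # "factor"
levels(food$healthy_level)  # "1" "2" "3" "4" "5"
```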
The healthy_level categories are now numbers as opposed to tick-marks. This is an improvement but an even better solution would be to recode the categories.
Recode the healthy_level categories and re-run the histogram function.
If your food data is cleared from your
Environment, the changes that you made to the
healthy_level variable will not be saved.
To save your changes permanently, save your
food file as an R data file.
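Recoding and saving the food data could be sketched as below. The category labels, the toy values, and the file name food.Rda are all assumptions made for illustration; use whatever labels match how your class defined healthy_level.

```r
# Toy stand-in for the food data; columns and values are made up.
food <- data.frame(
  calories = c(150, 90, 250, 40, 120),
  healthy_level = factor(c(2, 4, 1, 5, 3))
)

# Recode the numeric categories with descriptive labels.
# These labels are hypothetical, not taken from the lab.
food$healthy_level <- factor(
  food$healthy_level,
  levels = c("1", "2", "3", "4", "5"),
  labels = c("Very Unhealthy", "Unhealthy", "Neutral", "Healthy", "Very Healthy")
)

# Save the updated data as an R data file (a temporary folder is used here
# so the example leaves no files behind).
food_path <- file.path(tempdir(), "food.Rda")
save(food, file = food_path)
```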