SPSS dataset logic is not always logical. However, for working proficiently with datasets, just a handful of basics is sufficient. These are explained in this tutorial.
This tutorial focuses on working with SPSS datasets. For a definition and some background on datasets, see SPSS Datasets.
Working with SPSS Datasets
- It is recommended you follow along with the steps in this tutorial. You can copy-paste-run the syntax we'll use on idols.sav and service_provider.sav.
- We'll first set our working directory (with the CD command) to the folder where the files are located. Next, we'll open one of them and compute some test variable.
*Set working directory and open data file.
get file 'idols.sav'.
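The syntax above shows only the GET FILE step. As a minimal sketch of all three steps (set the working directory, open a file, compute a test variable), assuming a hypothetical folder d:/downloaded and a hypothetical variable name test_var:

```spss
* Hypothetical folder; replace it with wherever you saved the data files.
cd 'd:/downloaded'.
* Open the data file from the working directory.
get file 'idols.sav'.
* Compute a test variable holding a constant (hypothetical name).
compute test_var = 1.
execute.
```

Because the working directory is set first, GET FILE needs only the file name rather than a full path.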
Untitled Datasets
[Screenshot: An Untitled Dataset in SPSS]
- Note the empty square brackets in the top left corner. These mean that this is an untitled dataset: we haven't assigned a name to it yet.
- Something specific to an untitled dataset is that it is closed as soon as another dataset is opened. Any changes made to it are discarded.
- For a quick demonstration, open another data file (for example, service_provider.sav with GET FILE). You'll see that the previous dataset has now been replaced by a new (untitled) one.
- Datasets can be prevented from being closed by naming them with DATASET NAME.
- Dataset names don't need quotes around them and must comply with the naming rules for variables.
*Open idols.sav and apply name to dataset.
get file 'idols.sav'.
dataset name idols_data.
*Open service_provider.sav and apply name to dataset.
get file 'service_provider.sav'.
dataset name service_data.
*Compute test variable.
compute test_0 = 0.
Now you have two open datasets. The first didn't close upon opening the second because a name ("idols_data") was applied to it.
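One way to confirm which datasets are open is the DATASET DISPLAY command, which lists the available datasets in the output window. A minimal sketch, assuming the two datasets were named as above:

```spss
* List all open datasets in the output window.
dataset display.
```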
The Active Dataset
[Screenshot: The Active Dataset in SPSS]
- In the previous syntax we also computed a new variable. Upon inspection, you'll see it's present in service_data but not in idols_data.
- This is because service_data was the active dataset when we ran the command.
- The active dataset is usually the data you opened or clicked on last. In the Windows taskbar, the active dataset can be recognized by a red cross in its icon.
- If we want to run syntax on one of the inactive datasets, we'll first activate it. Don't do this by clicking its window; use the DATASET ACTIVATE command instead.
*Compute test variable in idols_data.
dataset activate idols_data.
compute test_1 = 1.
Activating idols_data before the command ensures that the new variable will be created in this dataset.
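Switching back works the same way. The sketch below assumes both datasets are still open and uses a hypothetical variable name (test_2). The optional WINDOW keyword controls whether the Data Editor window is also brought to the front.

```spss
* Make service_data the active dataset and bring its window to the front.
dataset activate service_data window = front.
* This variable ends up in service_data only (hypothetical name).
compute test_2 = 2.
execute.
```

Leaving out the WINDOW keyword activates the dataset without changing which window is in front.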
Closing SPSS Datasets
- When we're done with the data, we'll close both datasets. (We'll usually save them as data files first. Without doing so, our changes are discarded. This is explained in SPSS Datasets.)
- A peculiarity here is that the last open dataset actually stays open. However, its name is removed so it will be gone as soon as other data are opened.
- Alternatively, if you really want it closed, run NEW FILE after closing the dataset.
*Close datasets. Alternatively, use "dataset close all." instead of the two lines below.
dataset close idols_data.
dataset close service_data.
*Get rid of the last open dataset.
new file.
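Since closing a dataset discards unsaved changes, you may want to save first. A minimal sketch, assuming both named datasets are still open; the output file names are hypothetical:

```spss
* Save each dataset under a new, hypothetical file name before closing it.
dataset activate idols_data.
save outfile 'idols_edited.sav'.
dataset activate service_data.
save outfile 'service_provider_edited.sav'.
* Close all named datasets at once.
dataset close all.
```

Saving under new file names leaves the original data files untouched, so the modifications can always be redone from syntax.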
This tutorial explains what SPSS datasets are. For a practical tutorial on working with datasets, see SPSS Datasets Tutorial 1 - Basics.
Right, now an SPSS dataset is SPSS data that only exists in your computer's working memory (RAM). Changes you make to it are discarded unless it's saved as a data file.
SPSS Dataset versus SPSS Data File
- "SPSS data file" refers to data that exists on a storage device (such as a hard disk or a USB stick).
- Obviously, switching your computer off and back on does not affect an SPSS data file.
- When you open an SPSS data file, it's copied to your computer's working memory (RAM). This copy is referred to as an SPSS dataset.
- Importantly, changes you make to a dataset are recorded only in working memory. Such changes are therefore discarded if you close a dataset (perhaps accidentally).
- This is usually no big issue as long as you work from syntax. In this case, just reopen the data file and rerun the syntax you used for your modifications.
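Recovering from an accidentally closed dataset could look as follows. This is only a sketch: it assumes your modifications were saved in a syntax file, and the name idols_modifications.sps is hypothetical.

```spss
* Reopen the original, unmodified data file.
get file 'idols.sav'.
* Rerun the saved modifications; the syntax file name is hypothetical.
insert file = 'idols_modifications.sps'.
```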
The Active Dataset
- Starting from SPSS version 14, you can have multiple datasets open at once.
- However, there's always one active dataset. This is the only dataset to which the commands you run are applied. (The exceptions are commands that can address multiple datasets at once, such as commands for merging datasets like MATCH FILES.)
- The active dataset is usually the one you opened last. However, you can switch to a different one with the DATASET ACTIVATE command.
- In order to distinguish between open datasets, you can assign a name to each with the DATASET NAME command.
- If assigned, the dataset name is shown between square brackets right behind the data file name. (See screenshot above.)
- When desired, these names can be used in syntax for explicitly addressing different datasets.
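As an illustration of addressing datasets by name, the sketch below stacks the cases of two named datasets with ADD FILES. The file and dataset names (wave_1, wave_2) are hypothetical, and the example assumes both files hold the same variables.

```spss
* Open two hypothetical data files and name the resulting datasets.
get file 'wave_1.sav'.
dataset name wave_1.
get file 'wave_2.sav'.
dataset name wave_2.
* Stack the cases of both named datasets into one combined dataset.
add files file = wave_1 /file = wave_2.
execute.
```

Referring to the datasets by name means the merge does not depend on which dataset happens to be active when the command runs.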