Reputation: 3294
I was going through a tutorial on time series and came across something like this:
for i in (train, test):
    print(i)
Now, my expectation was that we are iterating over a tuple of train and test. But surprisingly, I found that it processed all of the train data first, followed by the test data. What is actually happening behind the scenes?
EDIT: train and test are pandas DataFrames. Assume the code is
for i in (a, b):
    print(i)
Then the output is as follows.
In case of lists:
[1,2,3]
[2,4]
In case of DataFrames:
   0
0  1
1  2
2  3
   0
0  2
1  4
Upvotes: 1
Views: 486
Reputation: 9385
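In Python you can create a tuple (essentially an immutable list) by writing (1, 2, 3). This is similar to how you can create a list with [1, 2, 3]. What you are doing in the for-loop is creating a tuple of length two, with entries train and test, then looping over them. The loop body therefore runs twice: once with i bound to the whole train DataFrame and once with i bound to test, which is why all of train is printed before test.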
The following prints 1, 2, and 3:
my_tuple = (1, 2, 3)
for i in my_tuple:
    print(i)
This is the same as:
for i in (1, 2, 3):
    print(i)
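Applied to the case in the question, where each tuple entry is itself a collection, the loop body still runs once per entry, so each whole object is printed in turn. A minimal sketch, using plain lists as stand-ins for the DataFrames (the names a, b, and seen are illustrative, not taken from the tutorial):

```python
a = [1, 2, 3]  # stand-in for train
b = [2, 4]     # stand-in for test

seen = []
for i in (a, b):
    # i is bound to the whole list a on the first pass, then to b
    print(i)
    seen.append(i)

# The body ran once per tuple entry, not once per element
print(len(seen))
```

The same holds for DataFrames: print(i) receives the entire object each time, so the full contents of the first entry appear before the second.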
The reason your tutorial does this in a loop is simply that the operations needed to run prediction on train and on test are identical.
An example which is probably closer to what your tutorial is doing is the following:
train = load_train_data()
model = train_model(train)
test = load_test_data()
for dataset in (train, test):
    predictions = model.predict(dataset)
    print(predictions)
Which is just the same as:
train = load_train_data()
model = train_model(train)
test = load_test_data()
train_predictions = model.predict(train)
print(train_predictions)
test_predictions = model.predict(test)
print(test_predictions)
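To check that the two forms behave the same, here is a runnable sketch. MeanModel and the small lists are hypothetical stand-ins for the tutorial's model and data, which are not shown in the question; only the loop structure is the point.

```python
class MeanModel:
    """Hypothetical stand-in: predicts the training mean for every row."""

    def __init__(self, data):
        self.mean = sum(data) / len(data)

    def predict(self, dataset):
        return [self.mean for _ in dataset]


train = [1.0, 2.0, 3.0]  # illustrative data, not from the tutorial
test = [2.0, 4.0]
model = MeanModel(train)

# Looped form: one pass per dataset in the tuple
looped = []
for dataset in (train, test):
    looped.append(model.predict(dataset))

# Unrolled form: the same two calls written out by hand
unrolled = [model.predict(train), model.predict(test)]

print(looped == unrolled)
```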
Upvotes: 2