Reputation: 87
I'm using version 2.4 with PySpark.
I'm having trouble passing parameters so that the year and month of execution are used inside the WHERE clause.
How can I do this?
import pyspark
from datetime import datetime, timedelta
from os.path import expanduser, join, abspath
from pyspark import SparkContext
from pyspark.sql import SQLContext, HiveContext
import sys, os, logging, getopt
sc = SparkContext()
hc = HiveContext(sc)
sql = SQLContext(sc)
hc.sql(""" SELECT * FROM bd_raw_data.table_iop WHERE pt_year = 2022 AND pt_month = 1 """).registerTempTable("temp_df_table_iop")
Upvotes: 0
Views: 90
Reputation: 1739
You can simply use Python string formatting to substitute the values into the query,
as below -
year = <your year>
month = <your month>
hc.sql("""SELECT * FROM bd_raw_data.table_iop WHERE pt_year = {year} AND pt_month = {month}""".format(year=year,month=month)).registerTempTable("temp_df_table_iop")
Upvotes: 2
Reputation: 3
Do you want to pass the parameter directly inside the query, like this?
hc.sql(""" SELECT * FROM bd_raw_data.table_iop WHERE pt_year = 2022 AND pt_month = 1 """).registerTempTable("temp_df_table_iop")
Upvotes: 0