Reputation: 2534
Is it possible to call a secondary script from a Hive script and have it run before the rest of the script executes?
My goal is to run a setup script that downloads and organizes the data needed for my main query.
I am looking for something like:
create table logcontent (content string) row format delimited fields terminated by '\n';
**call secondary hive script with date-range arguments and download necessary logs into <logcontent>**
**perform the rest of the query**
I want to do this to create a clean abstraction for the table setup, so that the end user doesn't have to worry about it; it will be done for them.
I know that AWS has the option to add a Hive script as a step in the job, but how can I do the same thing locally? Is this possible? If so, what is the syntax? If not, what are some workarounds?
Upvotes: 2
Views: 1139
Reputation: 2574
You could try something like this from a shell script:
hive -e "create table logcontent (content string) row format delimited fields terminated by '\n';" \
&& sh /path/to/script.sh \
&& **perform the rest of the query**
The && operator runs the next command only if the previous command finished successfully (exit status 0).
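To make the short-circuit behavior concrete, here is a minimal sketch that does not need Hive at all. The `step` and `run_chain` functions are hypothetical stand-ins for the three stages (create table, download, query); the second argument to `step` is the exit status it pretends to return.

```shell
# step NAME STATUS: print the stage name, then return STATUS.
# Stand-in for a real command such as hive or a download script.
step() {
  echo "running $1"
  return "$2"
}

# run_chain DOWNLOAD_STATUS: chain three stages with &&.
# A non-zero status from any stage stops the chain right there.
run_chain() {
  step create 0 && step download "$1" && step query 0
}
```

Calling `run_chain 0` runs all three stages, while `run_chain 1` stops after the download stage and returns 1, so the query stage never runs against incomplete data.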
Upvotes: 1
Reputation: 384
One approach is to organize your main shell script along the lines of the template below.
## Content of main.sh
## Code block to set up the Hadoop environment and add its config to PATH, if not already done.
## Step 1> Create the hive table in non-interactive mode.
hive -e "create table test(id int, name string) row format delimited fields terminated by '\n'"
# Check whether the command succeeded; if/else logic can be added here.
echo $?
## Step 2> Call the secondary script executable to download logs
ksh downloadlogs.sh # Assuming the download script can be invoked this way.
## Step 3> Execute rest of the hive queries to organize data
hive -e "select * from test"
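The template above can be extended with the if/else logic it mentions. This is only a sketch under some assumptions: `run_pipeline`, `HIVE_CMD` and `DOWNLOAD_SCRIPT` are hypothetical names, the table is the `logcontent` table from the question, and the download script is assumed to accept a start date and an end date as its two arguments.

```shell
#!/bin/sh
# Hypothetical settings; override them in the environment if needed.
HIVE_CMD="${HIVE_CMD:-hive}"
DOWNLOAD_SCRIPT="${DOWNLOAD_SCRIPT:-./downloadlogs.sh}"

# run_pipeline START_DATE END_DATE
# Runs the three steps in order and stops at the first failure.
run_pipeline() {
  start_date="$1"
  end_date="$2"

  # Step 1: create the table in non-interactive mode.
  "$HIVE_CMD" -e "create table if not exists logcontent (content string) row format delimited fields terminated by '\n'" || {
    echo "table setup failed" >&2
    return 1
  }

  # Step 2: download the logs for the date range
  # (assumes the script takes the two dates as arguments).
  sh "$DOWNLOAD_SCRIPT" "$start_date" "$end_date" || {
    echo "log download failed" >&2
    return 2
  }

  # Step 3: run the rest of the queries; its exit status becomes ours.
  "$HIVE_CMD" -e "select * from logcontent"
}
```

Adding a final line such as `run_pipeline "$@"` to main.sh would let the end user call `sh main.sh <start-date> <end-date>` without touching the table setup. The distinct return codes (1 for table setup, 2 for download) are a design choice to make it easier to tell which step failed.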
Upvotes: 1