sedavidw

Reputation: 11691

Breaking down a map[chararray] in Pig Latin

I am very new to Pig Latin, so please excuse any ignorance in the following question. I have inherited some code that essentially does the following:

USERS = LOAD 'hbase://some_table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('s:*', '-caster HBaseBinaryConverter --limit some_limit') AS (user_map:map[chararray]);

Now if I DUMP USERS, I get something like the following (fake data):

([n1#{"s":{"added": 1430668638000, "lastseen": 1430668638000, "expires": 1433260638000}},n2#{"s":{"added": 1430668638000, "lastseen": 1430668638000, "expires": 1433692638000}},n22#{"segment":{"added": 1430668638000, "lastseen": 1430668638000, "expires": 1433260638000}},n3#{"s":{"added": 1430668638000, "lastseen": 1430668638000, "expires": 1433692638000}},n4#{"segment":{"added": 1430668638000, "lastseen": 1430668638000, "expires": 1433692638000}}])
([n8#{"s":{"added": 1428792426000, "lastseen": 1428792426000, "expires": 1431816426000}},n9#{"segment":{"added": 1428792426000, "lastseen": 1428792426000, "expires": 1431816426000}},n11#{"segment":{"added": 1428792426000, "lastseen": 1428792426000, "expires": 1431816426000}}])

Essentially, I want to get at the n* values (the map keys) in the output, but I am not sure how to break them out of this schema. Any help would be greatly appreciated.

To explain my question a bit more: perhaps my understanding of the map[chararray] schema (and how to manipulate it) is lacking.

EDIT: My desired output would store all of the n* information in a variable called TITLES, so that DUMP TITLES would give the following:

n1#
n2# ...

Upvotes: 1

Views: 86

Answers (1)

sedavidw

Reputation: 11691

I was able to answer my own question by writing a Python UDF. In Pig, the call looks like this:

N_S = FOREACH USERS GENERATE my_udfs.translate_map(user_map);

My Python UDF looks something like this:

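@outputSchema("doc:chararray")
def translate_map(input):
    # Build one space-separated string from the keys of the Pig map.
    n_str = ""
    for k, v in input.items():
        # Only the key (n1, n2, ...) is needed; the value v is ignored.
        n_str += str(k)
        n_str += " "

    return n_str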
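
For anyone who wants to check the key-extraction logic outside of Pig, here is a minimal standalone sketch. It is plain Python with no Pig decorator, and it assumes the map arrives in the UDF as a dict whose values are strings. The sample_row data is made up to resemble the dump in the question, and sorting the keys is my own addition (map ordering is not something I would rely on), so treat this as an illustration rather than the exact UDF above.

```python
def translate_map(user_map):
    """Return the map's keys as one space-separated string."""
    if user_map is None:
        # Assumption: a null map field reaches the UDF as None.
        return None
    # Sorted so the output is deterministic; unlike the UDF above,
    # join() leaves no trailing space.
    return " ".join(sorted(str(k) for k in user_map))


# Hypothetical row modelled on the dump in the question: the keys are
# the n* names, the values are chararray (string) payloads.
sample_row = {
    "n2": '{"s": {"added": 1430668638000}}',
    "n1": '{"s": {"added": 1430668638000}}',
    "n22": '{"segment": {"added": 1430668638000}}',
}

print(translate_map(sample_row))  # n1 n2 n22
```

If one key per output row is needed (as in the desired DUMP TITLES output), the same loop could return a collection of keys instead of a single string, but that would need a different output schema than doc:chararray.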

Upvotes: 1
