Wednesday, August 21, 2013

Hadoop: Pig delimiter issue

The column delimiter in file test.txt is CTRL+A, i.e. \001.

I run raw = LOAD '.../test.txt' USING PigStorage('\001') AS (a, b, c);

but PigStorage does not seem to recognize \001 and raises an error.

Questions:
1. Without replacing the delimiters in the file, how can this be solved?
2. If the delimiter (not necessarily \001) is given as a regex, how should the LOAD statement be written?
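For question 1, recent Pig versions parse escape sequences such as '\\t' and '\\u0001' in PigStorage's delimiter argument, so CTRL+A can be named directly and the file does not need to be rewritten. A sketch, assuming that escape support is available in your Pig version; the path and field names are the placeholders from the question:

```pig
-- Sketch: PigStorage's delimiter string is parsed for escapes, so CTRL+A
-- (\u0001) can be passed without modifying the input file.
raw = LOAD '.../test.txt' USING PigStorage('\\u0001') AS (a, b, c);
DUMP raw;
```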
------ Solution --------------------------------------------
1. Use sed to replace '\001' with a delimiter PigStorage can recognize.
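The replacement step can be sketched in the shell. `tr` handles the octal escape portably (the file names here are just examples; the original path in the thread is truncated):

```shell
# Create a sample file whose columns are separated by CTRL+A (\001),
# mimicking the test.txt described in the question.
printf 'a\001b\001c\n' > test.txt

# Translate every \001 into a tab so PigStorage('\t') can read the result.
tr '\001' '\t' < test.txt > test_tab.txt

cat test_tab.txt
```

sed works too (GNU sed accepts `s/\x01/\t/g`), but `tr` is POSIX and simpler for a one-character substitution.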

2. raw = LOAD '.../test.txt' USING PigStorage('your delimiter') AS (a, b, c); (note that PigStorage treats its argument as a literal delimiter, not a regex)
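For question 2, a regex-based load is usually done with piggybank's MyRegExLoader rather than PigStorage. A sketch, assuming piggybank.jar is available; the jar path and the example pattern (three fields split on \u0001) are assumptions, not from the thread:

```pig
-- Assumption: piggybank.jar sits at this hypothetical path.
REGISTER /path/to/piggybank.jar;

-- MyRegExLoader matches each input line against the regex and emits one
-- field per capture group; this example pattern captures three columns
-- separated by \u0001.
raw = LOAD '.../test.txt'
      USING org.apache.pig.piggybank.storage.MyRegExLoader('([^\\u0001]*)\\u0001([^\\u0001]*)\\u0001([^\\u0001]*)')
      AS (a, b, c);
```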


------ For reference only ---------------------------------------
Please take a look at my problem, thanks.
I don't know the answer, but bumping the thread!
------ For reference only ---------------------------------------
This reply was deleted by a moderator at 2012-02-02 10:39:00.

------ For reference only ---------------------------------------
Does nobody know the answer to this question?
------ For reference only ---------------------------------------
Closing the thread; since no one answered, I'll end it anyway.
