Wednesday, August 7, 2013

An HBase join question

Join: there are two tables.
TBA
ROWKEY: datetime_userid
TBB
ROWKEY: userid // account ID
cf: username // user account name

TBA's ROWKEY is used only for SCAN,
but when the data is returned I would like each row in the form datetime_username.
Could the experts here suggest an efficient join algorithm?
------ Solution --------------------------------------------
Doing a JOIN in HBase is rather awkward.
If the SCAN does not return much data, you can borrow the merge join method used by RDBMSs.

My idea:
1, Define a String array, TBA_ARR.
2, Put the data returned by the TBA SCAN into TBA_ARR,
flipping the key structure as you go: datetime_userid => userid_datetime
3, Sort TBA_ARR and collect the distinct userids into a new String array, TBB_ARR.
At the same time, add a Get for each distinct userid to the list sget:
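import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.util.Bytes;

// RowKeyA is the sorted TBA_ARR; each entry has the form userid_datetime.
List<Get> sget = new ArrayList<Get>();
String LastUserID = "";
String[] UserID = new String[RowKeyA.length]; // distinct userids, in sorted order
int ScanCount = 0; // number of distinct userids found so far
for (int i = 0; i < RowKeyA.length; i++) {
    String CurrentUserID = RowKeyA[i].split("_")[0];
    // The array is sorted, so equal userids are adjacent; skip the repeats.
    if (LastUserID.equals(CurrentUserID))
        continue;
    LastUserID = CurrentUserID;
    sget.add(new Get(Bytes.toBytes(LastUserID)));
    UserID[ScanCount] = LastUserID;
    ScanCount++;
}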

4, Fetch all the matching TBB rows in one call with HTable.get(List<Get>) and put the results into TBB_ARR.
5, Because TBA_ARR and TBB_ARR are both ordered by userid, a single pass over the two arrays is enough to stitch their data together.
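
To make the five steps concrete, here is a rough sketch in plain Java with no HBase dependency. The class name MergeJoinSketch, the flip and join methods, and the batchLookup parameter are all made up for illustration. batchLookup stands in for the batched TBB read: it is assumed to take the distinct userids and return the usernames in the same order, with null for a userid that has no row in TBB. The sketch also assumes a userid never contains an underscore, and it drops rows whose userid is missing from TBB (an inner join).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class MergeJoinSketch {

    // Step 2: turn "datetime_userid" into "userid_datetime" so that sorting
    // groups the rows by userid. Assumes the userid contains no underscore.
    static String flip(String rowKey) {
        int sep = rowKey.lastIndexOf('_');
        return rowKey.substring(sep + 1) + "_" + rowKey.substring(0, sep);
    }

    // tbaRowKeys: row keys scanned from TBA, in the form datetime_userid.
    // batchLookup: hypothetical stand-in for the batched TBB read.
    static List<String> join(String[] tbaRowKeys,
                             Function<List<String>, List<String>> batchLookup) {
        // Steps 1-2: copy the keys, flipping each one.
        String[] flipped = new String[tbaRowKeys.length];
        for (int i = 0; i < tbaRowKeys.length; i++) {
            flipped[i] = flip(tbaRowKeys[i]);
        }

        // Step 3: sort, then collect the distinct userids. After sorting,
        // keys with the same userid are adjacent.
        Arrays.sort(flipped);
        List<String> userIds = new ArrayList<>();
        for (String key : flipped) {
            String userId = key.substring(0, key.indexOf('_'));
            if (userIds.isEmpty() || !userIds.get(userIds.size() - 1).equals(userId)) {
                userIds.add(userId);
            }
        }

        // Step 4: one batched lookup for all distinct userids.
        List<String> userNames = batchLookup.apply(userIds);

        // Step 5: a single pass over both lists. The index j only moves
        // forward, because both lists list the userids in the same order.
        List<String> joined = new ArrayList<>();
        int j = 0;
        for (String key : flipped) {
            int sep = key.indexOf('_');
            String userId = key.substring(0, sep);
            String datetime = key.substring(sep + 1);
            while (!userIds.get(j).equals(userId)) {
                j++;
            }
            String userName = userNames.get(j);
            if (userName != null) {
                joined.add(datetime + "_" + userName);
            }
        }
        return joined;
    }
}
```

The result comes back grouped by userid and, within a userid, ordered by datetime. If you need it in the original datetime order, sort the returned list once more at the end.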

If the SCAN returns a very large amount of data and a lot of processing logic is needed, I suggest using MapReduce together with HBase.
------ For reference only ----------------------------------
Any ideas?
Bumping my own thread.
------ For reference only ----------------------------------

Good solution, thank you. I'll go and digest it with a few colleagues first.
