TBA
ROWKEY: datetime_userid
TBB
ROWKEY: userid // account id
cf: username // user account name
TBA's ROWKEY is used only for SCAN,
but in the returned data I would like to get records of the form datetime_username.
Could the experts here suggest an efficient join algorithm?
------ Solution -------------------------- ------------------
Doing a JOIN in HBase is rather troublesome.
If the SCAN does not return much data, you can borrow the merge-join method from RDBMSs.
My idea:
1, define a String array object TBA_ARR
2, put the data scanned from TBA into TBA_ARR,
flipping the key structure as you go: datetime_userid => userid_datetime
3, sort TBA_ARR, and collect all the distinct userids into a new String array object TBB_ARR
At the same time, add each userid in TBB_ARR to a List<Get>:
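import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.util.Bytes;

// RowKeyA is assumed to be the sorted TBA_ARR; each entry is userid_datetime.
List<Get> sget = new ArrayList<Get>();
String LastUserID = "";
String[] UserID = new String[RowKeyA.length];  // TBB_ARR: distinct userids, in sorted order
int ScanCount = 0;                             // number of distinct userids found so far
for (int i = 0; i < RowKeyA.length; i++) {
    String CurrentUserID = RowKeyA[i].split("_")[0];
    if (LastUserID.equals(CurrentUserID))
        continue;                              // same user as the previous row, already queued
    LastUserID = CurrentUserID;
    sget.add(new Get(Bytes.toBytes(LastUserID)));
    UserID[ScanCount] = LastUserID;
    ScanCount++;
}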
4, through HTable.get(List<Get>), fetch the matching TBB rows in one batch
5, since TBA_ARR and TBB_ARR are both in lexicographic order, a single forward pass over the two arrays (the merge step of a sort-merge join) is enough to stitch the two sets of data together
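
The steps above can be sketched end to end in plain Java. This is only an illustration, not tested against a real cluster: the class and method names (MergeJoinSketch, flip, join) are made up for the example, and a Map stands in for the batch lookup on TBB so the code runs without HBase. It also assumes that a userid never contains an underscore and that rows whose userid is missing from TBB should be dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class MergeJoinSketch {

    // Step 2: datetime_userid => userid_datetime (split at the last underscore).
    static String flip(String rowKey) {
        int sep = rowKey.lastIndexOf('_');
        return rowKey.substring(sep + 1) + "_" + rowKey.substring(0, sep);
    }

    // tbaRowKeys: row keys scanned from TBA (datetime_userid).
    // tbb: hypothetical stand-in for TBB, mapping userid -> username.
    static List<String> join(String[] tbaRowKeys, Map<String, String> tbb) {
        // Steps 1-3: flip every key, then sort so equal userids are adjacent.
        String[] tbaArr = new String[tbaRowKeys.length];
        for (int i = 0; i < tbaRowKeys.length; i++) {
            tbaArr[i] = flip(tbaRowKeys[i]);
        }
        Arrays.sort(tbaArr);

        // Step 3: collect the distinct userids in sorted order (TBB_ARR).
        List<String> userIds = new ArrayList<>();
        String last = null;
        for (String key : tbaArr) {
            String userId = key.substring(0, key.indexOf('_'));
            if (!userId.equals(last)) {
                userIds.add(userId);
                last = userId;
            }
        }

        // Step 4: one lookup per distinct userid. With HBase this would be a
        // single batch get; here the Map plays that role (null = no such row).
        List<String> userNames = new ArrayList<>();
        for (String userId : userIds) {
            userNames.add(tbb.get(userId));
        }

        // Step 5: merge. Both lists are in the same order, so the index j
        // into userIds only ever moves forward.
        List<String> joined = new ArrayList<>();
        int j = 0;
        for (String key : tbaArr) {
            int sep = key.indexOf('_');
            String userId = key.substring(0, sep);
            String datetime = key.substring(sep + 1);
            while (!userIds.get(j).equals(userId)) {
                j++;
            }
            String userName = userNames.get(j);
            if (userName != null) {
                joined.add(datetime + "_" + userName);
            }
        }
        return joined;
    }
}
```

For example, joining the TBA keys 20130102_u2, 20130101_u1 and 20130103_u1 against a TBB holding u1 => alice and u2 => bob gives 20130101_alice, 20130103_alice, 20130102_bob. Note that the output is ordered by userid rather than by datetime; sort it again if the original scan order matters.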
If the SCAN returns a very large amount of data, or a lot of processing logic is needed, I suggest using MapReduce together with HBase
------ For reference only --------------- ------------------------
Anything?
Bumping my own thread.
------ For reference only ------------------------------- --------
Good solution, thank you. I'll go over it with a few colleagues first.