Scan query process
Step 1. HTable.getScanner()
- Close any scanner previously opened on the server, so that the server does not tie up excess resources
  - Client: ScannerCallable.call() -> close(scannerId)
  - Server: HRegionServer.close(scannerId)
- Open the new scanner
  - Client: ScannerCallable.call() -> openScanner(regionName, scan)
  - Server:
    - Create a RegionScanner
    - Add the scanner to the server's map of open scanners
    - Create a Lease for the newly created scanner
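
The server-side bookkeeping in Step 1 (register the scanner in a map, give it a lease, remove it on close) can be sketched roughly as follows. This is a toy model and not the HRegionServer code: the class ScannerRegistrySketch, its Entry type, and the expireLeases() method are hypothetical names made up for illustration, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of server-side scanner bookkeeping; NOT the HBase implementation.
class ScannerRegistrySketch {
    /** Stand-in for a RegionScanner together with the expiry time of its lease. */
    static final class Entry {
        final String regionName;
        final long leaseExpiresAtMs;
        Entry(String regionName, long leaseExpiresAtMs) {
            this.regionName = regionName;
            this.leaseExpiresAtMs = leaseExpiresAtMs;
        }
    }

    private final Map<Long, Entry> scanners = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);
    private final long leasePeriodMs;

    ScannerRegistrySketch(long leasePeriodMs) {
        this.leasePeriodMs = leasePeriodMs;
    }

    /** Creates a scanner entry, stores it in the map, and starts its lease. */
    long openScanner(String regionName, long nowMs) {
        long id = nextId.getAndIncrement();
        scanners.put(id, new Entry(regionName, nowMs + leasePeriodMs));
        return id;
    }

    /** Removes the scanner; returns false if the id is unknown or already closed. */
    boolean close(long scannerId) {
        return scanners.remove(scannerId) != null;
    }

    /** Drops scanners whose lease has run out; returns how many were removed. */
    int expireLeases(long nowMs) {
        int before = scanners.size();
        scanners.values().removeIf(e -> e.leaseExpiresAtMs <= nowMs);
        return before - scanners.size();
    }

    int openCount() {
        return scanners.size();
    }

    public static void main(String[] args) {
        ScannerRegistrySketch registry = new ScannerRegistrySketch(1000);
        long id = registry.openScanner("region-1", 0);
        System.out.println("open scanners: " + registry.openCount());
        System.out.println("closed: " + registry.close(id));
    }
}
```

The lease is what lets the server reclaim a scanner whose client never calls close(), which is the same resource concern that motivates closing the old scanner first.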
Step 2. ResultScanner.next()
- Client: cache.poll(), or next(scannerId, caching) when the cache is empty
- Server: HRegionServer.next(scannerId, nbRows)
  - RegionScannerImpl.nextRaw(List outResults, int limit, String metric)
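
The client half of Step 2 (serve rows from a local cache and fetch a new batch only when the cache is empty) might look like the sketch below. It is a self-contained simulation, not the HBase client: CachingScannerSketch, BatchFetcher, overRows(), and rpcCount are hypothetical names, and the "server" is just an in-memory list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Toy model of client-side scan caching; NOT the HBase client code.
class CachingScannerSketch {
    /** Hypothetical stand-in for the next(scannerId, caching) round trip. */
    interface BatchFetcher {
        List<String> fetch(int caching);
    }

    private final Queue<String> cache = new ArrayDeque<>();
    private final BatchFetcher fetcher;
    private final int caching;
    private boolean exhausted;
    int rpcCount; // number of simulated round trips, exposed for the demo

    CachingScannerSketch(BatchFetcher fetcher, int caching) {
        this.fetcher = fetcher;
        this.caching = caching;
    }

    /** Returns the next row, or null once the scan is finished. */
    String next() {
        if (cache.isEmpty() && !exhausted) {
            List<String> batch = fetcher.fetch(caching);
            rpcCount++;
            if (batch.isEmpty()) {
                exhausted = true; // an empty batch marks the end of the scan
            }
            cache.addAll(batch);
        }
        return cache.poll();
    }

    /** In-memory "server" that hands out the given rows in batches. */
    static BatchFetcher overRows(List<String> rows) {
        int[] pos = {0};
        return n -> {
            int end = Math.min(pos[0] + n, rows.size());
            List<String> out = new ArrayList<>(rows.subList(pos[0], end));
            pos[0] = end;
            return out;
        };
    }

    public static void main(String[] args) {
        CachingScannerSketch scanner =
            new CachingScannerSketch(overRows(Arrays.asList("a", "b", "c", "d", "e")), 2);
        for (String row = scanner.next(); row != null; row = scanner.next()) {
            System.out.println(row);
        }
        System.out.println("round trips: " + scanner.rpcCount);
    }
}
```

A larger caching value means fewer round trips per scan at the cost of more memory held on the client per batch.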
Scanner types
- Server side: InternalScanner, KeyValueScanner
- Client side: ResultScanner
- Others: HFileScanner, MetaScanner
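1. InternalScanner
- A higher-level scanner abstraction used internally on the server. Implementations:
  - RegionScannerImpl
  - StoreScanner
  - KeyValueHeap
- Interface methods include
  - next(): returns a List of KeyValues
  - close(): closes the scanner and releases server-side resources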
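2. KeyValueScanner
- A low-level scanner used to fetch KeyValues. Implementations:
  - StoreScanner
  - StoreFileScanner
  - KeyValueHeap
  - NonLazyKeyValueScanner: every seek request is a real seek, doRealSeek() evaluates forward ? reseek(kv) : seek(kv)
    - MemStoreScanner
    - StoreScanner
    - KeyValueHeap
- Common methods
  - peek()
  - next()
  - seek(): positions the scanner at the specified KeyValue
  - reseek(): positions the scanner at the specified KeyValue, searching forward from the current position
  - requestSeek()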
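
To make the difference between seek() and reseek() concrete, here is a minimal sketch over a sorted in-memory list. It is an illustration under simplifying assumptions, not the KeyValueScanner interface itself: SortedListScannerSketch is a hypothetical class, keys are plain unique Strings instead of KeyValues, and requestSeek() is left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy scanner over a sorted list of unique keys; NOT an HBase class.
class SortedListScannerSketch {
    private final List<String> keys; // must be sorted ascending
    private int pos = 0;

    SortedListScannerSketch(List<String> sortedKeys) {
        this.keys = sortedKeys;
    }

    /** Returns the current key without advancing, or null at the end. */
    String peek() {
        return pos < keys.size() ? keys.get(pos) : null;
    }

    /** Returns the current key and advances, or null at the end. */
    String next() {
        String key = peek();
        if (key != null) {
            pos++;
        }
        return key;
    }

    /** Positions at the first key >= target, searching the whole list. */
    boolean seek(String target) {
        int i = Collections.binarySearch(keys, target);
        pos = i >= 0 ? i : -i - 1;
        return pos < keys.size();
    }

    /** Like seek(), but only ever moves forward from the current position. */
    boolean reseek(String target) {
        while (pos < keys.size() && keys.get(pos).compareTo(target) < 0) {
            pos++;
        }
        return pos < keys.size();
    }

    public static void main(String[] args) {
        SortedListScannerSketch scanner =
            new SortedListScannerSketch(Arrays.asList("a", "c", "e", "g"));
        scanner.seek("d");
        System.out.println("after seek(d): " + scanner.peek());
        scanner.next();
        scanner.reseek("b");
        System.out.println("after next() and reseek(b): " + scanner.peek());
    }
}
```

Because reseek() never moves backward, it is only a valid shortcut when the target lies at or after the current position; otherwise a full seek() is needed.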
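KeyValueHeap
- At the Region level it merges access across multiple stores; at the Store level it merges access across the memstore and the store files
- A PriorityQueue holds the scanners, ordered by KVScannerComparator: the peeked KVs are compared first, then the sequence IDs
  - MemStoreScanner: Long.MAX_VALUE
  - StoreFileScanner: the store file's sequence ID
  - StoreScanner: 0
- pollRealKV(): polls the PriorityQueue for the top scanner, forcing a real seek on it if one is still pending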
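
The ordering rule above (peeked key first, then sequence ID, with the newest data winning ties) amounts to a k-way merge over a priority queue. The following sketch shows that idea only: HeapMergeSketch and SubScanner are hypothetical names, keys are Strings rather than KeyValues, and lazy seeking (pollRealKV) is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy k-way merge in the spirit of KeyValueHeap; NOT the HBase implementation.
class HeapMergeSketch {
    /** Hypothetical sub-scanner: a sorted key list plus a sequence ID. */
    static final class SubScanner {
        final long sequenceId;
        private final List<String> keys;
        private int pos;

        SubScanner(long sequenceId, List<String> sortedKeys) {
            this.sequenceId = sequenceId;
            this.keys = sortedKeys;
        }

        String peek() {
            return pos < keys.size() ? keys.get(pos) : null;
        }

        String next() {
            String key = peek();
            if (key != null) {
                pos++;
            }
            return key;
        }
    }

    // Smaller peeked key first; on a tie the higher sequence ID (newer data) wins.
    static final Comparator<SubScanner> ORDER = (a, b) -> {
        int c = a.peek().compareTo(b.peek());
        return c != 0 ? c : Long.compare(b.sequenceId, a.sequenceId);
    };

    /** Returns "key@sequenceId" entries in merged order. */
    static List<String> merge(List<SubScanner> scanners) {
        PriorityQueue<SubScanner> heap = new PriorityQueue<>(ORDER);
        for (SubScanner s : scanners) {
            if (s.peek() != null) {
                heap.add(s); // exhausted scanners never enter the heap
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            SubScanner top = heap.poll();
            out.add(top.next() + "@" + top.sequenceId);
            if (top.peek() != null) {
                heap.add(top); // re-insert so it is re-ordered by its new head
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<SubScanner> scanners = Arrays.asList(
            new SubScanner(Long.MAX_VALUE, Arrays.asList("b", "d")), // memstore
            new SubScanner(5, Arrays.asList("a", "b")),              // older store file
            new SubScanner(9, Arrays.asList("b", "c")));             // newer store file
        System.out.println(merge(scanners));
    }
}
```

Giving the memstore scanner Long.MAX_VALUE guarantees that, for identical keys, the in-memory copy is always seen before any copy from a store file.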
ScanQueryMatcher
- While KVs are being scanned, decides whether the current KV is included and what to do next
- StoreScanner.getScanners(matcher) -> StoreFileScanner
- The ten MatchCode states
  - INCLUDE
  - INCLUDE_AND_SEEK_NEXT_ROW : moreRowsMayExistAfter(), getKeyForNextRow()
  - INCLUDE_AND_SEEK_NEXT_COL : getKeyForNextColumn()
  - DONE
  - DONE_SCAN
  - SEEK_NEXT_ROW : moreRowsMayExistAfter()
  - SEEK_NEXT_COL : getKeyForNextColumn()
  - SKIP : heap.next()
  - SEEK_NEXT_USING_HINT : getNextKeyHint()
  - NEXT (unused): do not include, jump to the next StoreFile or memstore (in time order)
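- public MatchCode match(KeyValue kv)
  - Check whether the KV belongs to the current row
  - Check whether the version has expired
  - Check whether the KV has been deleted
  - Check whether the KV falls within the time range
  - Apply Filters
  - ColumnTracker check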
- ColumnTracker
  - ScanWildcardColumnTracker
  - ExplicitColumnTracker
- DeleteTracker
  - ScanDeleteTracker
- Query options that affect delete handling
  - retainDeletesInOutput
  - keepDeletedCells: when true, the delete check is skipped
  - seePastDeleteMarkers
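
The order of checks in match() can be illustrated with a much-simplified matcher. Everything here is a sketch under stated assumptions, not ScanQueryMatcher itself: MatcherSketch and Cell are hypothetical types, only a subset of the MatchCode values is modelled, filters are omitted, the delete tracker treats every delete marker as masking all older versions of its column, and cells are assumed to arrive sorted by row, column, then newest timestamp first.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the match() decision sequence; NOT HBase's ScanQueryMatcher.
class MatcherSketch {
    enum MatchCode { INCLUDE, INCLUDE_AND_SEEK_NEXT_COL, SKIP, DONE, SEEK_NEXT_ROW, SEEK_NEXT_COL }

    /** Hypothetical simplified cell: row, column, timestamp, delete-marker flag. */
    static final class Cell {
        final String row, column;
        final long ts;
        final boolean isDelete;
        Cell(String row, String column, long ts, boolean isDelete) {
            this.row = row; this.column = column; this.ts = ts; this.isDelete = isDelete;
        }
    }

    private final String currentRow;
    private final long oldestAllowedTs; // cells older than this have expired
    private final long minTs, maxTs;    // requested time range [minTs, maxTs)
    private final int maxVersions;
    private final Set<String> deletedColumns = new HashSet<>(); // toy delete tracker
    private String lastColumn;
    private int versionsSeen;

    MatcherSketch(String currentRow, long oldestAllowedTs, long minTs, long maxTs, int maxVersions) {
        this.currentRow = currentRow;
        this.oldestAllowedTs = oldestAllowedTs;
        this.minTs = minTs;
        this.maxTs = maxTs;
        this.maxVersions = maxVersions;
    }

    MatchCode match(Cell kv) {
        // 1. Row check: a later row ends the current row, an earlier one must be skipped.
        int cmp = currentRow.compareTo(kv.row);
        if (cmp < 0) return MatchCode.DONE;
        if (cmp > 0) return MatchCode.SEEK_NEXT_ROW;
        // 2. Expiry: every remaining version of this column is older still.
        if (kv.ts < oldestAllowedTs) return MatchCode.SEEK_NEXT_COL;
        // 3. Deletes: remember markers, then hide the cells they mask.
        if (kv.isDelete) {
            deletedColumns.add(kv.column);
            return MatchCode.SKIP;
        }
        if (deletedColumns.contains(kv.column)) return MatchCode.SEEK_NEXT_COL;
        // 4. Time range: too new means an older version may still match.
        if (kv.ts >= maxTs) return MatchCode.SKIP;
        if (kv.ts < minTs) return MatchCode.SEEK_NEXT_COL;
        // 5. Filters would run here (omitted in this sketch).
        // 6. Column tracking: count versions per column.
        if (!kv.column.equals(lastColumn)) {
            lastColumn = kv.column;
            versionsSeen = 0;
        }
        versionsSeen++;
        if (versionsSeen > maxVersions) return MatchCode.SEEK_NEXT_COL;
        return versionsSeen == maxVersions
            ? MatchCode.INCLUDE_AND_SEEK_NEXT_COL
            : MatchCode.INCLUDE;
    }

    public static void main(String[] args) {
        MatcherSketch matcher = new MatcherSketch("r1", 10, 0, 100, 2);
        System.out.println(matcher.match(new Cell("r1", "a", 90, false)));
        System.out.println(matcher.match(new Cell("r1", "a", 80, false)));
    }
}
```

Returning a seek code instead of SKIP whenever the rest of a column or row cannot match is what lets the caller jump ahead rather than step through every remaining cell.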