A Deep Dive into Android HAL Binder

HAL binder is the IPC mechanism introduced in Android O (8.0) for communication between HAL (Hardware Abstraction Layer, native) processes and their clients (a client can be a native process or a Java framework process). HAL binder replaced the socket-based communication used previously. At the kernel level it is still built on the existing binder driver, but to move data between client and server it relies on a dedicated middle layer, HIDL, to translate interfaces and data. So what does HIDL-based HAL communication gain over the earlier socket approach? From an architectural point of view, HIDL gives client and server a clearly defined interface; from an efficiency point of view, binder IPC copies the payload only once, whereas a socket transfer requires two copies.

Android currently has two types of HAL:

  • Binderized HALs: the HAL runs in its own process, and clients talk to it over /dev/hwbinder;
  • Passthrough HALs: the HAL is loaded as a library into the calling process (the legacy mode, kept for compatibility).

Related code

  • /android/system/tools/hidl/: generates the Proxy (client-side interface) and Stub (server-side interface) from the .hal interface files;
  • /android/system/libhwbinder/: initializes the binder threads and handles reading from and writing to the binder driver;
  • /android/system/libhidl: HIDL state and HAL service-management interfaces;
  • /android/os/HwBinder: Java-layer hardware binder code; obtains the server-side interface and issues IPC calls to the server;
  • android_os_HwBinder.cpp (JNI): JNI code for the Java-layer HAL binder, forwarding Java-layer requests to the corresponding server process;
  • /android/hardware/interfaces/: the per-module HAL interfaces; each module contains an Android.bp script that generates the corresponding Proxy and Stub, e.g. the radio interface IRadio.hal and the sensors interface ISensors.hal;
  • /android/kernel/drivers/staging/android/: the binder driver.

The figure below is a simplified diagram of the HAL binder architecture. Readers familiar with binder will quickly recognize the classic binder client/server IPC structure; the difference for HAL binder is simply that the server process is an Android native process. In the next two articles I will use how the Telephony Framework (RILJ) communicates with the native process RILD over hardware binder as the running example, and explain the implementation and working principles of HAL binder from two angles:

  • how the HAL service manager, HwServiceManager, is started, and how system services are registered and looked up;
  • how RILJ communicates with RILD through HAL binder.

HwBinder Architecture

For the Android.bp build scripts, see https://android.googlesource.com/platform/build/soong/

This article focuses on the first question: how HAL binder is started and how it manages all the HAL services. Just like regular binder IPC (used by AMS, PMS, and so on), HAL binder needs a dedicated service manager that keeps track of the system's services and offers clients APIs such as service registration and lookup.

Starting HwServiceManager

Under /android/system/hwservicemanager/ there is a startup script, hwservicemanager.rc, which the init process parses after it starts:

```
service hwservicemanager /system/bin/hwservicemanager
    user system
    disabled
    group system readproc
    critical
    onrestart setprop hwservicemanager.ready false
    onrestart class_restart hal
    onrestart class_restart early_hal
    writepid /dev/cpuset/system-background/tasks
    class animation
```

In /android/system/core/rootdir/init.rc there is a trigger that starts the hwservicemanager system service:

```
on post-fs
    # Load properties from
    #     /system/build.prop,
    #     /odm/build.prop,
    #     /vendor/build.prop and
    #     /factory/factory.prop
    load_system_props
    # start essential services
    start logd
    start servicemanager
    start hwservicemanager
    start vndservicemanager
```

With that, the system loads hwservicemanager and enters its main function:

```cpp
class BinderCallback : public LooperCallback {
public:
    BinderCallback() {}
    ~BinderCallback() override {}

    int handleEvent(int /* fd */, int /* events */, void* /* data */) override {
        IPCThreadState::self()->handlePolledCommands();
        return 1;  // Continue receiving callbacks.
    }
};

int main() {
    // Initialize the hwbinder driver and configure the thread pool.
    configureRpcThreadpool(1, true /* callerWillJoin */);

    ServiceManager *manager = new ServiceManager();

    if (!manager->add(serviceName, manager)) {
        ALOGE("Failed to register hwservicemanager with itself.");
    }
    // What is TokenManager for?
    TokenManager *tokenManager = new TokenManager();

    if (!manager->add(serviceName, tokenManager)) {
        ALOGE("Failed to register ITokenManager with hwservicemanager.");
    }

    sp<Looper> looper(Looper::prepare(0 /* opts */));

    int binder_fd = -1;
    // Put the thread into polling mode, ready to handle IPC requests.
    IPCThreadState::self()->setupPolling(&binder_fd);
    if (binder_fd < 0) {
        ALOGE("Failed to acquire binder FD. Aborting...");
        return -1;
    }
    // Flush after setupPolling(), to make sure the binder driver
    // knows about this thread handling commands.
    IPCThreadState::self()->flushCommands();
    // IPC callback: invoked whenever there is incoming data to process.
    sp<BinderCallback> cb(new BinderCallback);
    if (looper->addFd(binder_fd, Looper::POLL_CALLBACK, Looper::EVENT_INPUT, cb,
                      nullptr) != 1) {
        ALOGE("Failed to add hwbinder FD to Looper. Aborting...");
        return -1;
    }

    // Tell IPCThreadState we're the service manager
    sp<BnHwServiceManager> service = new BnHwServiceManager(manager);
    IPCThreadState::self()->setTheContextObject(service);

    ioctl(binder_fd, BINDER_SET_CONTEXT_MGR, 0);
    ...
    rc = property_set("hwservicemanager.ready", "true");

    while (true) {
        looper->pollAll(-1 /* timeoutMillis */);
    }

    return 0;
}
```

ServiceManager's startup mainly does the following:

  • initialize the /dev/hwbinder driver and map a region of virtual memory for exchanging IPC data;
  • register the HAL service manager (the IPC context manager) with hwbinder;
  • listen on /dev/hwbinder for incoming data, and run the registered callback to handle commands when data arrives.
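The third point is essentially an epoll loop: Looper registers the binder fd with epoll and dispatches the callback when the fd becomes readable. As a rough, self-contained sketch of that mechanism (using an ordinary pipe in place of /dev/hwbinder, and hand-rolled epoll calls in place of android::Looper; all names here are illustrative, not AOSP code):

```cpp
#include <sys/epoll.h>
#include <unistd.h>

// A toy model of Looper::addFd + pollAll for hwservicemanager: a pipe
// stands in for /dev/hwbinder; when it becomes readable we "handle the
// polled commands" (in the real code this is BinderCallback::handleEvent
// calling IPCThreadState::handlePolledCommands()).
bool poll_pipe_demo() {
    int fds[2];
    if (pipe(fds) != 0) return false;

    int epfd = epoll_create1(0);
    if (epfd < 0) return false;

    epoll_event ev{};
    ev.events = EPOLLIN;   // the equivalent of Looper::EVENT_INPUT
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) != 0) return false;

    // Simulate the driver making the fd readable.
    const char msg[] = "transaction";
    if (write(fds[1], msg, sizeof(msg)) != (ssize_t)sizeof(msg)) return false;

    // One iteration of the "pollAll" loop.
    epoll_event out{};
    bool handled = false;
    if (epoll_wait(epfd, &out, 1, /*timeoutMillis=*/1000) == 1) {
        char buf[32];
        handled = read(out.data.fd, buf, sizeof(buf)) > 0;  // "handlePolledCommands"
    }
    close(fds[0]); close(fds[1]); close(epfd);
    return handled;
}
```

The real loop never exits (pollAll(-1) blocks indefinitely); the demo does a single poll iteration just to show the registration/readable/callback sequence.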

Initializing the hwbinder driver

The first thing HwServiceManager does on startup is configure the binder thread pool, telling the driver how many threads should handle IPC commands. HwServiceManager needs only a single binder thread, the caller's own thread (when the caller's thread joins the binder thread pool is covered below), so no new threads are spawned.

```cpp
// HidlBinderSupport.cpp
void configureBinderRpcThreadpool(size_t maxThreads, bool callerWillJoin) {
    ProcessState::self()->setThreadPoolConfiguration(maxThreads, callerWillJoin /*callerJoinsPool*/);
}
```

ProcessState is a singleton (it can be thought of as hwbinder's user-space process-state manager), so when ProcessState::self() is called, a global instance is created if one does not exist yet:

```cpp
sp<ProcessState> ProcessState::self()
{
    Mutex::Autolock _l(gProcessMutex);
    if (gProcess != NULL) {
        return gProcess;
    }
    gProcess = new ProcessState;
    return gProcess;
}
```
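The check-then-create under gProcessMutex is a classic lazy singleton: every caller observes the same instance. Purely as an illustration (this is not the AOSP code), the same guarantee can be expressed with std::call_once:

```cpp
#include <mutex>

// Illustrative only: the one-instance-per-process guarantee that
// ProcessState::self() gets from gProcessMutex, written with
// std::call_once instead of an explicit lock.
class FakeProcessState {
public:
    static FakeProcessState* self() {
        static std::once_flag sOnce;
        std::call_once(sOnce, [] { sInstance = new FakeProcessState(); });
        return sInstance;
    }
private:
    FakeProcessState() = default;
    static FakeProcessState* sInstance;
};
FakeProcessState* FakeProcessState::sInstance = nullptr;
```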

Constructing a ProcessState does two main things:

  • open the /dev/hwbinder driver, verify the binder and protocol versions, and configure the maximum number of binder threads (for hwbinder the default maximum is 0);
  • mmap /dev/hwbinder to a region of virtual memory of roughly 1 MB, used for exchanging IPC data.
```cpp
ProcessState::ProcessState()
    : mDriverFD(open_driver())  // opens the driver and sets the max thread count
    , mVMStart(MAP_FAILED)
    , mThreadCountLock(PTHREAD_MUTEX_INITIALIZER)
    , mThreadCountDecrement(PTHREAD_COND_INITIALIZER)
    , mExecutingThreadsCount(0)
    , mMaxThreads(DEFAULT_MAX_BINDER_THREADS)
    , mStarvationStartTimeMs(0)
    , mManagesContexts(false)
    , mBinderContextCheckFunc(NULL)
    , mBinderContextUserData(NULL)
    , mThreadPoolStarted(false)
    , mSpawnThreadOnStart(true)
    , mThreadPoolSeq(1)
{
    if (mDriverFD >= 0) {
        // mmap the binder, providing a chunk of virtual address space to receive transactions.
        mVMStart = mmap(0, BINDER_VM_SIZE, PROT_READ, MAP_PRIVATE | MAP_NORESERVE, mDriverFD, 0);
        if (mVMStart == MAP_FAILED) {
            // *sigh*
            ALOGE("Using /dev/hwbinder failed: unable to mmap transaction memory.\n");
            close(mDriverFD);
            mDriverFD = -1;
        }
    }
    else {
        ALOGE("Binder driver could not be opened. Terminating.");
    }
}
```
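The "roughly 1 MB" comes from BINDER_VM_SIZE, used in the mmap call above. In the Android O era sources it is defined as 1 MB minus two pages; the sketch below reproduces that arithmetic (treat the exact expression as illustrative, since it can vary between releases):

```cpp
#include <unistd.h>

// BINDER_VM_SIZE as defined in hwbinder's ProcessState.cpp (Android O era):
// 1 MB minus two pages, hence "roughly 1 MB" of transaction space mapped
// per process.
long binder_vm_size() {
    return (1 * 1024 * 1024) - sysconf(_SC_PAGE_SIZE) * 2;
}
```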

Once ProcessState is initialized and the instance obtained, the thread pool is configured (for a HAL service the pool generally contains a single thread: the main thread the service registers from), telling the driver the maximum number of threads needed:

```cpp
status_t ProcessState::setThreadPoolMaxThreadCount(size_t maxThreads) {
    status_t result = NO_ERROR;
    if (ioctl(mDriverFD, BINDER_SET_MAX_THREADS, &maxThreads) != -1) {
        mMaxThreads = maxThreads;
    } else {
        result = -errno;
        ALOGE("Binder ioctl to set max threads failed: %s", strerror(-result));
    }
    return result;
}
```

Registering the HAL service manager

With the thread pool configured, the next step is to obtain the current thread's Looper (yes, the native counterpart of the Handler message loop familiar from the framework), tell the hwbinder driver that this thread will handle HAL service context commands, and set up the Looper callback. After that, the HAL service manager, ServiceManager, is registered.

ServiceManager implements the IServiceManager interface. It manages all HAL services in the system and provides other processes with registration, lookup, and related functionality:

```
/**
 * Manages all the hidl hals on a device.
 *
 * Terminology:
 *   Package: "android.hidl.manager"
 *   Major version: "1"
 *   Minor version: "0"
 *   Version: "1.0"
 *   Interface name: "IServiceManager"
 *   Fully-qualified interface name: "android.hidl.manager@1.0::IServiceManager"
 *   Instance name: "manager"
 *   Fully-qualified instance name: "android.hidl.manager@1.0::IServiceManager/manager"
 */
interface IServiceManager {

    // Retrieve an existing service that supports the requested version.
    get(string fqName, string name) generates (interface service);

    /**
     * Register a service. The service manager must retrieve the (inherited)
     * interfaces that this service implements, and register them along with
     * the service.
     */
    add(string name, interface service) generates (bool success);

    enum Transport : uint8_t {
        EMPTY,
        HWBINDER,
        PASSTHROUGH,
    };

    // Get the transport of a service.
    getTransport(string fqName, string name) generates (Transport transport);

    // List all registered services. Must be sorted.
    list() generates (vec<string> fqInstanceNames);

    // List all instances of a particular service. Must be sorted.
    listByInterface(string fqName) generates (vec<string> instanceNames);

    /**
     * Register for service notifications for a particular service. Must support
     * multiple registrations.
     */
    registerForNotifications(string fqName,
                             string name,
                             IServiceNotification callback)
        generates (bool success);

    ...

    /**
     * When the passthrough service manager returns a service via
     * get(string, string), it must dispatch a registerPassthroughClient call
     * to the binderized service manager to indicate the current process has
     * called get(). Binderized service manager must record this PID, which can
     * be retrieved via debugDump.
     */
    // Register a passthrough client.
    registerPassthroughClient(string fqName, string name);
};
```
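The terminology block above defines the fully-qualified instance name, "&lt;fqName&gt;/&lt;instance&gt;", which is the key under which hwservicemanager stores and looks up services. A toy model of the add()/get() bookkeeping (class and member names here are made up; the real implementation also tracks transports, client PIDs, and death notifications):

```cpp
#include <map>
#include <string>

// Toy model only: services keyed by "<fqName>/<instance>", e.g.
// "android.hidl.manager@1.0::IServiceManager/manager". The stored
// value would really be a strong pointer to the service's IBinder.
class ToyServiceRegistry {
public:
    bool add(const std::string& fqName, const std::string& instance,
             void* service) {
        mServices[fqName + "/" + instance] = service;
        return true;
    }
    void* get(const std::string& fqName, const std::string& instance) const {
        auto it = mServices.find(fqName + "/" + instance);
        return it == mServices.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, void*> mServices;
};
```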

Next, BnHwServiceManager is made the context manager for HAL services, which requires two things: first, tell IPCThreadState which object manages the context, so it can receive IPC requests from other processes; second, tell the kernel that the context manager is the IBinder object with handle 0, for which the kernel keeps a node to use during IPC.

```cpp
// Tell IPCThreadState we're the service manager
sp<BnHwServiceManager> service = new BnHwServiceManager(manager);
IPCThreadState::self()->setTheContextObject(service);
// Then tell binder kernel
ioctl(binder_fd, BINDER_SET_CONTEXT_MGR, 0);
```

BnHwServiceManager is the server-side stub object of ServiceManager, corresponding to the client-side proxy BpHwServiceManager; where exactly these come from will be covered in detail in the next article.

Finally, hwservicemanager.ready is set to indicate that hwservicemanager is now available, and pollAll puts hwservicemanager into its message-loop wait: as soon as /dev/hwbinder has data to read, the previously registered BinderCallback is invoked to handle it.

```cpp
rc = property_set("hwservicemanager.ready", "true");

while (true) {
    looper->pollAll(-1 /* timeoutMillis */);
}
```
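Clients consume this flag: before connecting, libhidl's wait logic polls the hwservicemanager.ready system property until it reads "true". A generic sketch of that wait-for-flag pattern (the function name and signature are invented for illustration; in the real code the getter is effectively a property_get on hwservicemanager.ready):

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll a boolean "ready" getter until it returns true or the deadline
// passes. Returns the final readiness state.
bool wait_for_ready(const std::function<bool()>& isReady,
                    std::chrono::milliseconds timeout,
                    std::chrono::milliseconds interval =
                        std::chrono::milliseconds(10)) {
    auto deadline = std::chrono::steady_clock::now() + timeout;
    while (std::chrono::steady_clock::now() < deadline) {
        if (isReady()) return true;
        std::this_thread::sleep_for(interval);
    }
    return isReady();  // one last check at the deadline
}
```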

The next article will use the communication between Telephony and RILD to walk through how a HAL service actually works.
