High uplink traffic issue from old Android devices
On June 13th, we received reports that the Android System was consuming significant data traffic on Android 5.1 devices. After analysing the issue, we found that the high uplink traffic came from the mDNS system daemon. A recently added feature in Google Play services Version 17.1.22 used the NsdManager APIs, which triggered a latent bug in older versions of Android (i.e., pre-Android 8.0). The real issue is the old mDNS code: mDNS can go into an infinite loop on some pre-Android 8.0 devices.
Google has released a new version of Google Play services which does not trigger the mDNS bug. However, since the mDNS APIs are public, any app can use them and trigger this issue on these old devices again. We therefore strongly recommend that partners adopt the following two AOSP patches to prevent the issue from recurring.
- Patch #1 was introduced in O to prevent mDNS from going into an infinite loop caused by an integer overflow on timer ticks.
- Patch #2 was introduced in N to disable mDNS on non-multicast, non-broadcast interfaces.
The two patches above are based on the AOSP mDNS code, which may differ from the code provided by your SoC provider. We strongly recommend that you work with your SoC provider to ensure the patches are properly implemented on your device. If the code provided by your SoC provider differs from the AOSP implementation, please review the detailed analysis of the issue below and how to patch it.
Description of the problem
Under certain conditions, mdnsd will get into an infinite loop of calling mDNS_Execute with ticks = 1, i.e., 1024 times per second.
The cause is that the time adjustment code in mdnsd can overflow.
The relevant code is in the mDNS_Lock_() function:
#define mDNS_TimeNow_NoLock(m) (mDNSPlatformRawTime() + (m)->timenow_adjust)
…
if (m->mDNS_busy == 0)
{
    if (m->timenow)
        LogMsg("%s: mDNS_Lock: m->timenow already set (%ld/%ld)", functionname, m->timenow, mDNS_TimeNow_NoLock(m));
    m->timenow = mDNS_TimeNow_NoLock(m);
    if (m->timenow == 0) m->timenow = 1;
}
else if (m->timenow == 0)
{
    LogMsg("%s: mDNS_Lock: m->mDNS_busy is %ld but m->timenow not set", functionname, m->mDNS_busy);
    m->timenow = mDNS_TimeNow_NoLock(m);
    if (m->timenow == 0) m->timenow = 1;
}
if (m->timenow_last - m->timenow > 0)
{
    m->timenow_adjust += m->timenow_last - m->timenow;
    LogMsg("%s: mDNSPlatformRawTime went backwards by %ld ticks; setting correction factor to %ld", functionname, m->timenow_last - m->timenow, m->timenow_adjust);
    m->timenow = m->timenow_last;
}
m->timenow_last = m->timenow;
Here, m->timenow_last and m->timenow are signed integers. If m->timenow_last is positive enough and m->timenow is negative enough, the subtraction overflows. In C, signed integer overflow is undefined behavior. Depending on the compiler, its flags, and its optimization level, the code may behave as if the condition of the if statement is always true. In that case m->timenow is reset to m->timenow_last on every call, so the two values stay equal and the clock stops advancing.
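The difference between the intended check and what an optimizer may emit can be sketched as follows. This is an illustration only, not the AOSP patch. The function names are hypothetical, and the example assumes the tick value wraps from INT32_MAX to INT32_MIN. The first function computes the difference in unsigned arithmetic, which is well defined in C, and then inspects the sign bit. The second shows the simplification a compiler is allowed to make when it assumes signed overflow never happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Intended meaning of "last - now > 0" with wrap-around arithmetic.
 * Unsigned subtraction wraps modulo 2^32, so no undefined behavior occurs.
 * The result is "positive" when it is non-zero and its top bit is clear. */
bool went_backwards_wrapping(int32_t last, int32_t now)
{
    uint32_t diff = (uint32_t)last - (uint32_t)now;
    return diff != 0 && diff < UINT32_C(0x80000000);
}

/* What an optimizer may emit for "last - now > 0": since signed overflow
 * is undefined, it may assume it never happens and compare directly. */
bool went_backwards_as_optimized(int32_t last, int32_t now)
{
    return last > now;
}

/* The case from the text: last is as positive as it can be and the
 * (assumed) tick counter has just wrapped to the most negative value. */
void overflow_example(void)
{
    int32_t last = INT32_MAX;   /* 0x7fffffff */
    int32_t now  = INT32_MIN;   /* one tick later, after wrap-around */

    assert(!went_backwards_wrapping(last, now));     /* time moved forward */
    assert(went_backwards_as_optimized(last, now));  /* seen as "backwards" */
}
```

Under the optimized form, once last equals INT32_MAX every other value of now compares as smaller, so the "went backwards" branch is taken on every call.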
The MainLoop function works as follows:
for (;;)
{
    …
    if (!gotData)
    {
        mDNSs32 nextTimerEvent = mDNS_Execute(m);
        nextTimerEvent = udsserver_idle(nextTimerEvent);
        ticks = nextTimerEvent - mDNS_TimeNow(m);
        if (ticks < 1) ticks = 1;
    }
    else    // otherwise call EventLoop again with 0 timeout
        ticks = 0;
    timeout.tv_sec = ticks / mDNSPlatformOneSecond;
    timeout.tv_usec = (ticks % mDNSPlatformOneSecond) * 1000000 / mDNSPlatformOneSecond;
    (void) mDNSPosixRunEventLoopOnce(m, &timeout, &signals, &gotData);
    ...
}
If m->timenow_last ever reaches 0x7fffffff, this code will loop.
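To see why a clamped ticks value of 1 translates into roughly 1024 wake-ups per second, the timeout arithmetic from MainLoop can be isolated in a small helper. This is a sketch: the helper name ticks_to_timeout is hypothetical, it covers only the !gotData branch, and the constant 1024 for mDNSPlatformOneSecond is an assumption inferred from the "1024 times per second" figure given earlier.

```c
#include <stdint.h>
#include <sys/time.h>

/* Assumed tick rate, inferred from the text; stands in for
 * mDNSPlatformOneSecond. */
enum { kOneSecondTicks = 1024 };

/* Same arithmetic as the MainLoop excerpt for the !gotData branch,
 * including the clamp to a minimum of one tick. */
struct timeval ticks_to_timeout(int32_t ticks)
{
    struct timeval timeout;

    if (ticks < 1) ticks = 1;
    timeout.tv_sec  = ticks / kOneSecondTicks;
    timeout.tv_usec = (ticks % kOneSecondTicks) * 1000000 / kOneSecondTicks;
    return timeout;
}
```

For ticks = 1 this yields tv_sec = 0 and tv_usec = 976, just under one millisecond. Any zero or negative value is clamped to the same result, so a clock that never advances past the next timer event keeps the loop waking up at that rate.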