BSense: A Flexible and Open-Source Broadband Mapping Framework

We present BSense, a flexible broadband mapping system for assessing broadband coverage and the quality of broadband connections in a given geographic region. For coverage related analysis, it relies on data that is either obtained from ISPs or generated based on technology models and information about infrastructure sites. Broadband quality assessment in BSense is via host-based measurements using our multi-platform and technology-adaptive software client, which periodically runs as a background process on users' computers. The host-based software measurement approach employed in BSense is not only cost-effective but is also flexible and reduces measurement bias. BSense also incorporates a flexible broadband quality index for summarizing the collective effect of various underlying attributes such as download/upload speeds and latency. The BSense system has been implemented based on open-source software components. We conduct extensive evaluations of the measurement component of BSense to quantify system effects and to compare against other measurement techniques and approaches; the comparison shows that BSense results are close to those of more sophisticated and expensive alternatives. The usefulness of the BSense system is demonstrated using two real world case studies, one on identifying notspots in Scotland and the other on broadband quality assessment in a rural part of Scotland through a pilot deployment.


Introduction
Broadband mapping is the process of assessing broadband coverage, quality and market for a given geographical region (e.g., country, province, city). Broadband coverage assessment is aimed at identifying "notspots", i.e., locations not serviced by any broadband access technology. For areas that are "covered", assessing broadband quality is more relevant and interesting. Quality is measured using a set of performance metrics such as download/upload speeds, latency, jitter and packet loss rate. Several technology-specific and network provisioning factors affect quality in practice (e.g., line length, number of concurrent users, contention ratio, backhaul capacity). Choice and cost associated with broadband subscription are additional aspects that are of interest for broadband mapping, especially to consumers and regulators. To determine the amount of choice that a consumer has, one needs to find out which access technologies and Internet Service Providers (ISPs) are available at the consumer's location. Greater choice usually also implies lower cost (per Mbps) for the consumer. Moreover, choice and cost both tend to depend on the coverage and quality aspects: poor broadband coverage or quality in a region correlates well with lack of choice and/or higher costs for consumers in that region. Note that all four aspects (coverage, quality, choice and cost) vary with time, but the timescales of change may differ widely between them.
Interest in broadband mapping has been growing recently, in step with increasing consumer awareness and the recognition by governments of the importance of high-speed Internet access for all citizens. Different countries have launched national broadband mapping programs (e.g., [1][2][3][4]) to quantify the existing state of broadband delivery and to track the progress towards achieving targets set forth in national broadband plans, especially in view of the on-going debate on the role of public funding and regulation in enabling fast and universal broadband access. Beyond these government-initiated efforts, other broadband mapping examples include [5,6].
Despite these various mapping efforts, we observe that an open and flexible broadband mapping framework, key to effective and consistent mapping exercises, is still lacking. Open specification of assessment methodology and metrics is important for audit, whereas the use of open-source software results in lower cost implementations, which in turn enable broadband mapping efforts in developing regions. Flexibility is also important to accommodate diverse broadband access technologies and assessment perspectives, and to factor in the latest advances and best practice in terms of measurement techniques. Moreover, the existing broadband mapping efforts take one of two different approaches, which we refer to as model based (e.g., [1,7,8]) and measurement based (e.g., [5,6,9,10,11,12]), even though neither is enough on its own: model based approaches fail to capture the discrepancy between the expected/advertised broadband quality (e.g., in terms of download speed) and the actual quality experienced by consumers, whereas measurement based approaches are clearly not useful for identifying notspots.
In this paper, we present an open and flexible broadband mapping framework called BSense (Fig. 1) along with its implementation based on open-source software components.
- BSense incorporates both model based and measurement based approaches, keeping in mind the observation that each of those approaches is useful for a different purpose. For coverage related analysis, it relies on data that is either obtained from ISPs or generated based on technology models and information about infrastructure sites (e.g., cell tower locations). Broadband quality assessment in BSense is based on a host-based continuous measurement stream obtained using our multi-platform and technology-adaptive software client, which periodically runs as a background process on users' computers. In essence, the measurement component in BSense is crowd-sourced, enabling continual monitoring of fixed wired, wireless and dongle-based mobile broadband connections. The measurement paradigm employed in BSense results in a lower-cost and flexible alternative to [9,10], and reduces measurement bias compared to [5,11,12].
- Unique to BSense is a flexible broadband quality index for summarizing the collective effect of various underlying attributes such as download/upload speeds and latency. Specifically, we propose to separately model user preference concerning various performance attributes through a specific instantiation from a flexible family of utility functions and then combine them to produce an overall index by leveraging the general framework of multi-attribute utility theory [13].
Our extensive evaluation and validation of the measurement component of BSense shows that it yields measurement results similar to those of the hardware based measurement approach [9] and a recently proposed sophisticated measurement technique [14]. More crucially, to demonstrate the usefulness of the BSense system, we use two real world case studies: one on identifying notspots in Scotland and the other on broadband quality assessment in a rural part of Scotland through a pilot deployment involving 60 real users over a three month period.
The rest of the paper is structured as follows. Different broadband mapping approaches are discussed in the next section. Section 3 provides a detailed description of the design and implementation of the BSense system, including its broadband quality index component. Evaluation results and case studies of BSense are presented in Section 4, and Section 6 concludes the paper.

Related work
Broadband mapping approaches can be broadly classified into two categories: (1) model based and (2) measurement based. With the model based approach, broadband coverage and speeds are estimated based on theoretical or empirically derived models of access technologies, knowledge of network infrastructure (e.g., mobile network base stations, locations of phone exchanges) and configurations (e.g., radio parameters, contention ratio). [1] is an example of this approach to estimate 3G mobile broadband coverage. For ADSL, see [7,8] for examples of such models and their use in estimating broadband coverage. Such data is inherently optimistic as it does not consider various practical impediments (e.g., line quality, contention). Nevertheless, it is useful for coverage analysis in the absence of any measurement data. Note that this is the approach followed in [1][2][3][4] using the data obtained from ISPs. Measurement based approaches involve actual measurement of broadband connections, a necessity for assessing broadband quality in a region. None of the existing measurement approaches we are aware of rely on measurement data from ISPs. Measurement based approaches can be further divided into hardware-based and software-based approaches.
The hardware-based measurement approach (also called gateway-based approach) involves deploying a customized hardware box that directly connects to the home broadband router for a representative sample of users, and using the statistics gathered across all such boxes to estimate statistics for the whole population. It was pioneered by SamKnows [9] for UK Ofcom and US FCC sponsored broadband speed studies, and has also been considered recently in the academic community [10]. This approach can be more expensive than an equivalent software based measurement approach. Moreover, it is also limited in terms of flexibility: it is unviable for mobile/wireless broadband quality assessment and inaccurate for analysis at a fine-grained geographic granularity (e.g., city level) when the measurement campaign was originally planned at a coarser level (e.g., nationwide).
Software-based measurement approaches broadly come in three varieties:
- Web based: Web-based speed tests such as Speedtest [5] and NDT [11] can only gather sporadic and geographically non-uniform measurement data. They also suffer from measurement biases (e.g., users taking speed tests may have poor broadband connections or may not belong to a representative sample).
- Host daemon based: This approach relies on a measurement agent running in the background on user computers and can overcome the limitations of the web-based approach. However, existing systems following this approach (e.g., [6]) lack the openness and flexibility desirable in a broadband mapping system.
- Consumer and ISP independent: Dischinger et al. [12] present an interesting approach that does not require cooperation from either the consumers or the ISPs. This approach relies instead on certain specific but standard functionality from routers (e.g., responding with TCP RST packets upon receiving unsolicited ACKs). Such functionality may be disabled due to security concerns. If a particular ISP does not support this functionality on all its broadband routers, then that ISP is effectively ignored by this approach, which introduces a measurement bias and is thus undesirable from a broadband mapping perspective.
BSense uses two of the above mentioned approaches: (1) a model based approach for coverage related analysis (e.g., to identify notspots); and (2) a host daemon based software measurement approach for broadband quality assessment in a given region. The measurement approach taken in BSense allows measurement not only of fixed wired and wireless broadband connections but also of fixed mobile broadband connections like those considered in [15], which are based on USB dongles or integrated 3G/4G modems in portable devices (e.g., laptops and tablets).
Note that the measurement component of BSense evolved to its current approach starting from a different approach in its initial version [16]. Specifically, the earlier version of BSense followed a web browser based measurement approach using NDT [11], in much the same way as Speedtest [5] works. The advantages of the host daemon based software measurement approach we adopted in the current version of BSense are well articulated in a recent paper [17]. The authors of [17] provide an elaborate classification of measurement approaches for large-scale broadband performance monitoring, identify a comprehensive set of requirements and discuss related standardization activities [18]. Their analysis suggests that the host daemon based software measurement approach is the most effective one; they then develop a measurement platform called HoBBIT following this approach and present measurement results obtained using HoBBIT. This further justifies the measurement approach we take in BSense.
As most broadband subscriptions are for Internet access at homes, there has also been work recently on measurement of home network performance and configuration. An example is HomeNet Profiler [19], a tool that runs on an end-system connected to the home network and measures aspects such as set of devices, services and characteristics of the WiFi environment. Grover et al. [20] take a different measurement approach using a router with custom firmware and present an empirical study that focuses on varied aspects such as downtime, spectrum usage and traffic profile.
There have been several measurement studies focusing on performance aspects of broadband access networks [10,17,21,22]. For example, [10] presents a large-scale gateway based measurement study of home broadband performance and highlights key influential factors including access technology, traffic shaping policies of ISPs and bufferbloat. In a more recent work, Sundaresan et al. [22] focus on web performance and, using a router based measurement tool for analyzing web page load times called Mirage, show that last-mile latency is a key factor; they also identify home caching as an effective remedy to achieve faster page load times.
Some recent measurement studies have also focused on broadband performance in a developing country context [23,24]. Chetty et al. [23] report a small-scale measurement study of fixed and mobile broadband connections in South Africa and identify some challenges for broadband performance measurements in a developing country, including the need to limit measurement traffic overhead so as to stay within data caps and keep costs low, and the need to recruit and continually engage participants. They also report some interesting findings from their measurement study, such as mobile connections being faster than fixed ones, and higher latency to popular websites due to poor ISP-level connectivity between users and servers and to distant content hosting sites. Koradia et al. [24] present a small-scale measurement study of mobile data connectivity performance in (rural) India and, similarly to [23], find that latency is an issue; they conclude that service providers should configure their networks to provide lower latencies and that content should be hosted closer to users.
We will discuss the work in [25][26][27][28] that is related to the broadband quality index component of BSense later in Section 3.3.

Overview
Broadband mapping is needed by all stakeholders: broadband users, ISPs, policy makers and regulators. As such, an effective broadband mapping framework should engage and involve all these stakeholders. Moreover, broadband coverage and quality vary over time with newer deployments, network upgrades and the emergence of new access technologies. This further makes cooperation among different stakeholders necessary for a complete, reliable and evolving broadband map. With the above in mind, our proposed BSense framework (illustrated in Fig. 1) views broadband mapping as a cooperative exercise involving different stakeholders.
We now outline some incentives for the various stakeholders to continually contribute to the broadband mapping exercise. Broadband users would benefit from comparing their observed broadband quality over time with the advertised service of their ISP. Equally, they would be interested in knowing the broadband coverage and quality in their neighborhood from different ISPs. A user can obtain such information from the mapping system in exchange for installing and running broadband connection quality measurement software on a home computer. Results of periodic tests continually feed into the measurement database. ISPs normally pay for market research information to determine areas in which to upgrade their networks and improve their service quality, with a view to maintaining and expanding their customer base. They could get such information from the mapping system in return for proactively updating it with data about their "estimated" coverage and speeds at different locations (obtained using models) along with associated information on their various service offerings (packages). Lastly, policy makers and regulators can query the broadband mapping system to get a true picture of broadband coverage and quality in a region, and accordingly make informed decisions (e.g., public sector intervention, providing incentives for investments in underserved areas, telecom market regulation); additionally, they may be the entity owning or funding the mapping initiative, as is already the case in the UK, US and Germany [1][2][3][4].
In the following, we elaborate on how the BSense framework combines the best aspects of existing mapping approaches, using data from ISPs for coverage analysis and relying on user side measurements for quality assessment. As part of measurement data collection, our implementation explicitly considers the diversity of consumer platforms that exist in the real world. A flexible broadband quality index is presented in Section 3.3 to summarize several different attributes (e.g., download speed, upload speed, latency) using a single metric that quantifies the quality of a broadband connection. We will further emphasize the unique aspects of BSense and position it in the wider context in Section 5. Figure 2 depicts the BSense software architecture, which is described in the rest of this section.

BSense server
BSense brings together different types of data. Estimated broadband data from ISPs is fed into the BSense database via web service API calls. Broadband users (consumers) are the key source of continuous measurement data for the mapping system. This is enabled by a lightweight software agent termed BSense Agent that runs in the background on a user computer and periodically communicates with BSense Test Servers to measure technical attributes of the user's broadband connection such as download speed, upload speed and latency. Digital geographic data from country-specific sources and demographic data from population census are additionally used as layers underneath estimated or measured broadband statistics to generate broadband coverage or quality maps as needed. Our database implementation in BSense uses the open source PostgreSQL database management system augmented with the PostGIS extensions to handle spatial data.
The BSense database schema is shown in Fig. 3. As broadband mapping is usually carried out at some geographic granularity (e.g., national-level, state-level), geographic units play a key role in the broadband mapping data. We assume that the geographic region of interest is organized into distinct GeoUnits, each with an associated unique ID, name and boundaries. Fine grained GeoUnits may in turn belong to several coarser GeoUnits. We incorporate such hierarchical geographic subdivisions via the GeoUnitTiers tables. Taking Scotland as an example, the postcode would be a suitable candidate for the fine-grained geographic unit; there are 152,000 postcodes in total. Postcodes in turn are aggregated into 1222 Scottish Census Area Statistics (CAS) wards with a minimum size of 50 residents and 20 households. CAS wards are further aggregated into 32 council areas of varying size. Demographic statistics for a geographic unit are stored in GeoUnitStatistics. In the current implementation, we successfully imported spatial data for postcodes, wards and councils for Scotland from governmental sources into BSense, and population data from the 2001 census of Scotland.
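To make the tiered GeoUnit idea concrete, the following is a minimal in-memory sketch in Python. It is not the BSense schema: the class, field and function names, the single-parent chain, and the example IDs are all hypothetical and chosen only for illustration. It shows how a per-postcode quantity (here, a count of measurement samples) could be rolled up to wards and councils.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GeoUnit:
    """Hypothetical stand-in for a GeoUnit row (ID, name, tier, parent link)."""
    unit_id: str
    name: str
    tier: str  # e.g. "postcode", "ward", "council"
    parent: Optional["GeoUnit"] = None


def ancestors(unit: GeoUnit) -> List[GeoUnit]:
    """Return the coarser units containing `unit`, nearest first."""
    chain = []
    current = unit.parent
    while current is not None:
        chain.append(current)
        current = current.parent
    return chain


def rollup_counts(samples_per_unit: Dict[str, int],
                  units: Dict[str, GeoUnit]) -> Dict[str, int]:
    """Add each fine-grained count to the unit itself and to all its ancestors."""
    totals: Dict[str, int] = {}
    for unit_id, count in samples_per_unit.items():
        unit = units[unit_id]
        for u in [unit] + ancestors(unit):
            totals[u.unit_id] = totals.get(u.unit_id, 0) + count
    return totals


# Made-up example hierarchy: two postcodes in one ward in one council.
council = GeoUnit("C1", "Example Council", "council")
ward = GeoUnit("W1", "Example Ward", "ward", parent=council)
pc_a = GeoUnit("P1", "Example Postcode A", "postcode", parent=ward)
pc_b = GeoUnit("P2", "Example Postcode B", "postcode", parent=ward)
units = {u.unit_id: u for u in (council, ward, pc_a, pc_b)}

totals = rollup_counts({"P1": 12, "P2": 5}, units)
```

In a PostGIS-backed deployment the same roll-up would more naturally be expressed as a query over the tier tables; the sketch only illustrates the containment logic.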
The BSense server side includes a web server for hosting a public website that users can access for registering and downloading the BSense agent software and subsequently to retrieve their broadband connection statistics.  The web server also supports a set of web service API calls over SOAP for interaction between the BSense system and various stakeholders. The current API consists of the following calls:

- BroadbandTestRecord(): called by the BSense agent every time a measurement test is completed. It records the results into the RawMeasurementData table in the database.
- AddPackage(), EditPackage() and DeletePackage(): used by participating ISPs to manage their broadband service packages stored in the database.
- AddEstimatedData(), EditEstimatedData() and DeleteEstimatedData(): called by ISPs to update the estimated broadband data for each GeoUnit covered by any of their service packages.
- LookUpMappingData(): invoked by users via the public website, and by ISPs and policy makers using SOAP calls, to query the BSense system.
These API calls are handled by a server side component that enforces security and access control, validating the input and checking whether an API call is made by a party with the required permissions.
In addition to SOAP based web services, BSense provides external access via the Open Geospatial Consortium's standard WMS (Web Map Service) and WFS (Web Feature Service) to obtain raster and vector geo-referenced images, respectively, of a geographical area of interest; most open-source and commercial GIS software products can directly use WMS and WFS services. BSense also provides an in-built web application based on WMS, developed using the open-source GeoExt and OpenLayers frameworks, to further ease access to broadband maps and their visualization. Access to the BSense database via WMS and WFS is enabled by the well known open-source GeoServer, a Java-based server that allows users to view and edit geospatial data.

BSense agent
Each broadband user participating in a BSense based mapping exercise runs a software agent (BSense Agent) that facilitates continuous and cost-effective measurement of the user's broadband connection. As such, the agent is a key element of the BSense framework for gleaning the quality of broadband provisioning in a given region. Given the diversity of operating system (OS) platforms used by consumers in the real world, the agent should function on the different commonly used platforms to avoid measurement bias. The BSense agent was designed explicitly keeping this requirement in mind.
We first give a high-level overview of the measurement process. A participating broadband user would download the agent from a public website like the one we developed (http://broadbandforall.net) and install it on the user's home computer. The agent runs in the background and periodically wakes up to perform a measurement test of the user's broadband connection. The time interval between measurements (which determines the measurement frequency) is a customizable parameter whose setting is a tradeoff between gathering fine-grained measurement samples over time and measurement overhead. Each measurement test consists of the following sequence of steps (Fig. 4):
1. The agent queries the BSense server to get the details of the measurement test to be performed.
2. The BSense server replies with an "experiment definition" (elaborated below) as well as details of the test server to be used (e.g., IP address, port number).
3. The BSense server also simultaneously notifies the test server about the impending measurement test from the user's agent.
4. The agent interprets and follows the experiment definition received, generating the traffic flow requested and/or receiving the incoming traffic to/from the specified test server.
5. Upon test completion, the agent summarizes the traffic traces from the test and uploads the summary to the BSense server.
Since the effectiveness of the mapping framework improves with a larger number of participating users, the mapping system should be scalable and robust to server failures. In the case of the BSense server, this can be achieved through the use of a server farm (the current approach) or by hosting the BSense server on a cloud platform. The use of multiple test servers, as in our current design, also contributes towards scalability and fault tolerance. As regards the location of test servers, we advocate their deployment at neutral Internet exchange points (IXPs) (e.g., the ones listed at http://www.euro-ix.net/) to avoid introducing bias against users of some ISPs. In our current implementation, however, test servers are co-located with the BSense server farm.
BSense could in principle use any multi-platform network performance measurement tool. As our focus is not on new measurement techniques but rather on developing an open and flexible framework for broadband mapping, we opt to use an existing performance measurement tool. Specifically, we choose the widely used traffic generator D-ITG [29] in our implementation as it has several attractive features: it is open source; it can be made to work on different platforms and behind most common types of NATs with minimal effort; and it provides a high degree of flexibility when it comes to traffic generation. On the user side, the D-ITG client is wrapped inside the BSense agent. Note that the BSense design is flexible enough to allow replacing D-ITG with any other measurement tool (e.g., [14]).
We now briefly describe the experiment definition sent to the user-side agent every time it is about to do a measurement test. An experiment is defined as a set of traffic session specifications with each session potentially consisting of multiple concurrent or partially overlapping flows. Specifically, each experiment in our context is a sequence of three traffic sessions: initial ping-like UDP session with short packets to measure latency, jitter and packet loss rate, followed by an upstream traffic session and then a downstream traffic session. The parameters for these sessions are set to the following default values: initial UDP session with one bidirectional flow with 56 byte packets and 10 packets/second for 60 seconds; upstream traffic session: 8 concurrent UDP/TCP flows with 1024 byte packets at 400 packets/second for 15 seconds; downstream traffic session with similar parameters as the upstream session. We consider two experiment definitions corresponding to using either TCP or UDP and alternate between them every 15 minutes (the default time between measurement tests in our implementation).
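As a rough illustration of what the default experiment definition implies, the sketch below encodes the three sessions as plain Python data and computes the offered load and traffic volume of each. The class and field names are hypothetical and do not reflect the actual BSense or D-ITG definition format. It also assumes that the stated packet rate applies per flow and that packet sizes exclude header overhead; if the 400 packets/second figure is an aggregate over the 8 flows, the load and volume figures should be divided by 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One traffic session of an experiment (hypothetical representation)."""
    name: str
    flows: int
    packet_bytes: int
    packets_per_second: int  # assumed to be a per-flow rate
    duration_s: int

    def offered_load_mbps(self) -> float:
        """Aggregate sending rate across all flows, in Mbit/s."""
        return self.flows * self.packet_bytes * 8 * self.packets_per_second / 1e6

    def volume_megabytes(self) -> float:
        """Total data sent in one direction over the session, in MB (10^6 bytes)."""
        return (self.flows * self.packet_bytes * self.packets_per_second
                * self.duration_s) / 1e6


# Default values as stated in the text above.
DEFAULT_EXPERIMENT = [
    Session("latency", flows=1, packet_bytes=56, packets_per_second=10, duration_s=60),
    Session("upstream", flows=8, packet_bytes=1024, packets_per_second=400, duration_s=15),
    Session("downstream", flows=8, packet_bytes=1024, packets_per_second=400, duration_s=15),
]

total_duration_s = sum(s.duration_s for s in DEFAULT_EXPERIMENT)
for s in DEFAULT_EXPERIMENT:
    print(f"{s.name}: {s.offered_load_mbps():.4f} Mbit/s, {s.volume_megabytes():.3f} MB")
```

Under the per-flow assumption, each bulk session offers about 26.2 Mbit/s and transfers about 49 MB, which is the kind of figure that matters when measurement overhead has to be weighed against sampling frequency.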
Note that it is straightforward in our design to add new experiment definitions and to assign different experiment definitions to different user agents if needed. Also note that in a host-based software measurement approach like ours, in-home wireless network use and other active use of the broadband connection from other devices within the home may occur concurrently with the broadband connection measurement traffic. However, the problem of cross-traffic on the network path is unavoidable with end-to-end performance measurement, even with a hardware-based measurement approach. We believe it is best dealt with via statistical filtering.

Broadband quality index
As noted at the outset, coverage and quality are the two key aspects of interest for broadband mapping. While broadband coverage in a particular location can be quantified as a binary variable, the same is not true for broadband quality as the latter is dependent on several underlying technical attributes such as download speed, upload speed and roundtrip latency. Due to the lack of standard ways to summarize the collective impact of those several attributes, the focus is often solely on download speeds even though it is widely recognized that other metrics such as upload speeds and latency are also important. This issue has been discussed at length by Sundaresan et al. [27] where they propose the notion of "network nutrition label" which contains comprehensive information about various network metrics of a broadband connection like throughput, latency and loss rate. They advocate the use of nutrition labels not just to assess the quality of a broadband connection but also to help the user evaluate an ISP service plan before subscribing to it. Of particular relevance to our work, the authors of [27] highlight and discuss (but do not address) the challenge of translating low-level network metrics to higher-level metrics that are more meaningful to users.
Defining an index is a common approach to deal with problems of the above nature. The only work we are aware of that tries to address the issue of developing a broadband quality index is, surprisingly, a sociological study [25] relying on expert surveys to determine the relative importance of various technical attributes. More importantly, the work in [25] only provides a very specific approach to defining the broadband quality index, while we are interested in defining a more flexible and general framework. The other related work on this issue, as far as we know, is [26,28]. The authors of [26] design a tool called HostView to collect network performance data that is annotated with users' perception of network quality via a low overhead user experience sampling algorithm; such annotated data, they argue, is key to designing mechanisms to detect network performance degradations that affect users' quality of experience. The same authors in [28] use the data collected with HostView to characterize the performance of networked applications in different environments and find that the application mix of a user and environmental factors (e.g., access network / interface type) have a big influence on data rates and round-trip times, respectively.
Our main idea for designing a suitable index is to model each attribute impacting broadband quality as a utility function and then draw upon the multi-attribute utility theory (MAUT) [13] to define the broadband quality index (BQI) as a composite function of utility function values for the individual attributes.
In our model, let $F = \{f_1, \ldots, f_n\}$ denote the set of network attributes (features) to be included in the BQI, covering important attributes characterizing a broadband connection. For the sake of concreteness, we focus on three key performance attributes in this paper: download speed ($d$, in Mbps), upload speed ($u$, in Mbps) and round-trip latency ($l$, in milliseconds).
We first aim to identify a suitable family of realistic single-attribute utility (SAU) functions for modeling user preferences about individual attributes, and then consider their composition into a multi-attribute function. For the attributes under consideration (i.e., download speed, upload speed and latency), we observe that sigmoid functions, whose graphs are "S-shaped" curves (see Fig. 5), better reflect user satisfaction. This is because improvement in utility from improving any of these attributes beyond a point is marginal. Equally, when these attributes are below a certain threshold for speeds and above a certain threshold for latency, the change in utility is again marginal. In between these extremes, the improvement in utility with improvement for any of these attributes is noticeable and substantial.
[Fig. 5 caption: The sigmoid utility function can be seen as a transfer function between a given attribute f and the perceived utility associated with specific values of f. In fact, the function shown is a modified sigmoid function, given in Eq. 1, that realizes zero utility when the value of f is zero.]
Thus, we define a set of utility functions $u_f(f)$, each defined on a given attribute $f \in F$ as the modified sigmoid function of Eq. 1, where parameters $a$ and $b$ determine the nature of each utility function curve. A pragmatic approach to specify a utility function for each attribute would be to have the BSense administrator pick two strategic values of each attribute $f$ and provide their corresponding utility values, $u^{o}$ for the low-end value and $u^{*}$ for the high-end value (Eq. 2). These two points can be carefully picked so that they represent the utility of low-end and high-end broadband connections (e.g., with $u^{o} = 0.2$ and $u^{*} = 0.8$, or $u^{o} = 0.1$ and $u^{*} = 0.9$). Intuitively, the lower knee in the curve represents the value of the attribute which is deemed insufficient, and the upper knee describes the attribute value which is good enough for the service. As a consequence, "poor" broadband connections that are only able to offer attribute values (e.g., download speeds) below the lower point offer only marginal utility to the users. Similarly, the incremental utility above the upper threshold is also marginal. The low and high values can, for example, be based on current policies and regulations (e.g., the "Universal Service Obligation" to set the bottom bar that ISPs have to provide and the policy maker must enforce) or the current state-of-the-art (e.g., the fastest commercially available service to set the top bar).
The parameters a and b for each of our SAU functions can then be derived from the two corresponding 'knobs' specified above by solving the pair of equations obtained by substituting the values from Eq. 2 in Eq. 1.
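The derivation of a and b from two (attribute value, utility) points can be sketched as follows. This is only an illustration under a stated assumption: it uses the plain logistic u(f) = 1 / (1 + exp(-a (f - b))) as a stand-in, not the modified sigmoid of Eq. 1 (in particular, the plain logistic does not give exactly zero utility at f = 0). The function names are hypothetical, and the example points are the download and latency settings used later in the pilot study.

```python
import math


def logit(u):
    """Inverse of the logistic function: log(u / (1 - u)), for 0 < u < 1."""
    return math.log(u / (1.0 - u))


def fit_sigmoid(f_lo, u_lo, f_hi, u_hi):
    """Solve u(f_lo) = u_lo and u(f_hi) = u_hi for the parameters a and b
    of the assumed logistic u(f) = 1 / (1 + exp(-a * (f - b)))."""
    # Taking the logit of both equations gives two linear equations in a and a*b.
    a = (logit(u_hi) - logit(u_lo)) / (f_hi - f_lo)
    b = f_lo - logit(u_lo) / a
    return a, b


def utility(f, a, b):
    """Evaluate the assumed logistic utility at attribute value f."""
    return 1.0 / (1.0 + math.exp(-a * (f - b)))


# Download speed in Mbps: 2 Mbps -> 0.1, 24 Mbps -> 0.9.
a_down, b_down = fit_sigmoid(2.0, 0.1, 24.0, 0.9)

# Latency in ms: lower is better, so the "high-end" point has the smaller value
# (200 ms -> 0.1, 20 ms -> 0.9) and the fitted slope a comes out negative.
a_lat, b_lat = fit_sigmoid(200.0, 0.1, 20.0, 0.9)
```

Because the two equations become linear after the logit transform, no numerical solver is needed for this stand-in; a different functional form for Eq. 1 may require one.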
To provide a quality index for each broadband connection, we first need to generate a summary statistic from all the measurements that have been gathered by BSense for that connection. In the current implementation, we use the median values of download speed, upload speed and latency. These values along with their respective a and b parameter values as input to Eq. 1 determine the utilities of the broadband connection in question with respect to each of those three attributes.
Multi-attribute utility theory assists us in combining the various SAU functions in a single equation, whose form depends upon the particular independence conditions fulfilled by the different SAU functions. For simplicity, we assume mutual additive independence in this paper. Then the resulting multi-attribute utility function can be represented as the weighted sum
$U = \sum_{f \in F} k_f \, u_f(f)$ (3)
where k_f is the scaling constant (weight) for attribute f. A further simplifying assumption would be to have all scaling constants (weights) k_f equal. This is reasonable given that our main purpose is to demonstrate the value of multi-attribute utility theory and utility functions in providing a flexible framework for defining a broadband quality index. A more general approach would be to set these weights based on the usage profile (including application mix) of a user. A user with predominantly web traffic would assign a greater weight to download speed, whereas another user who mainly accesses cloud-based services would value upload speed more. Similarly, a user who uses the Internet primarily for gaming applications would value latency. Since a user usually exhibits a combination of such characteristics, and more so at a household level with multiple users with diverse usage patterns, characterizing the usage profile of a user/household would be a good way to come up with the weights in Eq. 3 in practice. Such a characterization can be done experimentally based on measurements taken at the home gateway or broadband router, but it is beyond the scope of this paper.
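As a rough sketch of how the per-connection index could be computed, the snippet below takes the median of each attribute's measurements, applies a single-attribute utility function, and combines the results as a weighted sum with weights normalized to sum to one. The function names and sample values are hypothetical, and the clamped linear utilities are placeholders standing in for the sigmoid SAU functions so that the example stays self-contained.

```python
from statistics import median


def linear_utility(worst, best):
    """Placeholder single-attribute utility: 0 at `worst`, 1 at `best`, clamped.
    This is a stand-in for illustration, not the sigmoid of Eq. 1."""
    def u(x):
        return min(1.0, max(0.0, (x - worst) / (best - worst)))
    return u


def quality_index(measurements, utilities, weights=None):
    """Additive multi-attribute utility over the median of each attribute.

    measurements: dict attribute -> list of measured samples
    utilities:    dict attribute -> callable mapping a value to [0, 1]
    weights:      dict attribute -> non-negative weight (equal if omitted)
    """
    attrs = list(utilities)
    if weights is None:
        weights = {f: 1.0 for f in attrs}
    total = sum(weights[f] for f in attrs)
    return sum(
        (weights[f] / total) * utilities[f](median(measurements[f]))
        for f in attrs
    )


# Hypothetical samples for one connection (Mbps, Mbps, ms).
samples = {
    "download": [11.0, 12.0, 14.5],
    "upload": [0.9, 1.0, 1.2],
    "latency": [100.0, 110.0, 150.0],
}
sau = {
    "download": linear_utility(0.0, 16.0),
    "upload": linear_utility(0.0, 2.0),
    "latency": linear_utility(200.0, 20.0),  # lower latency is better
}

equal_index = quality_index(samples, sau)
download_heavy = quality_index(
    samples, sau, {"download": 2.0, "upload": 1.0, "latency": 1.0}
)
```

Passing a weights dictionary is where a usage profile would plug in; for example, a household dominated by web traffic could weight download speed more heavily, as in `download_heavy`.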

Evaluation
In this section, our goal is two-fold: (1) to evaluate the measurement methodology adopted in BSense under different conditions; and (2) to present two case studies using BSense to demonstrate its value for broadband mapping investigations aimed at coverage analysis and quality assessment, respectively.

BSense measurement methodology: operating system impact
BSense relies on various Operating System (OS) calls to manage timers, access the real-time clock, the network and the local filesystem. It is also affected by OS process scheduling. Since BSense agent software would be deployed on different OS platforms, it is useful to characterize the OS impact on performance measurement results.
To meet this goal, we perform a controlled lab experiment as described below. We set up a small testbed as shown in Fig. 6. A BSense test server was installed on a Linux computer whose configuration is similar to those used in our real-world evaluations described in later sections. The server is connected over a 1Gbps full-duplex Ethernet link to a second computer, which is configured to run any of the three OS platforms (Microsoft Windows, Apple Mac OS X and Ubuntu Linux) as well as the BSense agent on those platforms. Measurement tests between agent and server follow the experiment definitions and default settings described in Section 3.2.2. On the network segment between the server and agent, we inserted a third Linux machine that acts as a router, performing NAT for the traffic going to/from the agent. The router also runs the dummynet tool [30] to allow us to emulate network characteristics of various typical broadband connections. We selected five scenarios, labeled (a)-(e). Configurations for scenarios (b)-(e) are influenced by common broadband connection types in rural Scotland, the setting for our other evaluations. In the UK, traditional 8Mbps ADSL lines are marketed under the name of "ADSL Max", whereas "Exchange Activate" ADSL lines are available only in rural areas; the configuration for the satellite connection reflects the Government-subsidized service for residents in remote and rural parts of Scotland.
Figure 6 shows the result of running measurement tests using the three different OS platforms for the computer where the agent software resides. The Linux platform is arbitrarily chosen as the reference and relative differences with the other two platforms are shown. Differences in upstream/downstream speeds across the scenarios are mostly around 1% and never exceed 2%. The relative error for the delay is within 4%, and is higher for the Windows platform as it slightly overestimates the round-trip delay.
From these results, we conclude that the OS impact on our measurement methodology is minimal under ideal conditions.

BSense measurement methodology: real world validation
Real-world conditions are much less ideal: several factors differ between broadband connections even when the host platform and hardware are identical, including broadband router hardware and configurations, access technologies and other differences on the end-to-end path. We characterize the results of using the BSense measurement methodology under real-world conditions in comparison with several alternative measurement tools, though some of them are not readily suitable for continuous measurement on different OS platforms (e.g., NDT).
As in the earlier controlled lab experiments, we consider several common broadband scenarios, now also including cable and wireless technologies. Specifically, we consider 6 representative users as follows:
- 3 broadband users on "unlimited 24Mbps" contracts with different ISPs, connected over ADSL2+ connections to three different phone exchanges;
- a "Cable" user on a 20Mbps downstream, 768Kbps upstream contract;
- one user of our Tegola [31] testbed in rural Scotland that doubles as a broadband wireless access network for communities in the testbed area; the user's Internet connection is over a multihop wireless path, including long-distance point-to-point 5.8GHz wireless links;
- a remote user connected via the earlier mentioned Government-subsidized satellite connection with speeds limited to 512Kbps downstream and 256Kbps upstream.
Each of these users was given a Linux laptop preconfigured to run back-to-back measurements using a suite of software measurement tools when connected over Ethernet to the broadband router at the user's premises. Measurements for each user are between the laptop and a server machine on the Internet that we set up and that is shared across the various tools and users.
Upstream and downstream speeds were measured using the following tools on the laptops handed to users:
- BSense agent (based on the D-ITG tool), using both UDP and TCP-based tests;
- ShaperProbe bandwidth estimation tool [14];
- NDT [11], which is one of the tools used for FCC consumer broadband measurement tests in the US (footnote 9).
The techniques underlying each of these tools are markedly different. Measurement using our BSense agent is based on using multiple parallel streams of TCP or UDP traffic to saturate the access path under the assumption that the access tier is the bottleneck. We use multiple connections to prevent the agent or server from becoming a bottleneck instead of the access path: because of the known limitations of the TCP receive window mechanism (footnote 10), a single connection may be unable to exploit the full speed available. Multiple simultaneous traffic flows are representative of many popular Internet applications (e.g., web browsing) and do not penalize the overall speed result. ShaperProbe estimates the upstream and downstream capacities using packet trains of back-to-back packets over UDP: K packet trains, each composed of L packets of size S, are sent over the network. The receiver measures the dispersion $\Delta$ of each train and calculates the path capacity as $C_a = (L-1)S/\Delta$. The median value of $C_a$ is given in output as a result. NDT uses packet dispersion techniques, measuring the inter-packet arrival times for all data and ACK packets sent or received. By also taking packet size into account, it can calculate the speed for each pair of packets sent or received.
Fig. 6 Performance comparison across different OS platforms (with respect to the Linux platform) in controlled lab experiments
Footnote 10: The TCP receive window is used by the receiver to tell the sender the buffer size available to store incoming data. The TCP window scale option, described in RFC 1323, is needed when the bandwidth-delay product is greater than 64 KB. If not supported or enabled, the achievable throughput of a single TCP connection may be limited (e.g., in case of an 80ms link, it cannot exceed 6.55 Mb/s).
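Two of the calculations in this passage can be checked with a few lines of arithmetic. The sketch below assumes an unscaled TCP receive window of 65,535 bytes for the single-connection limit, and assumes the standard packet-train estimate in which capacity is (L-1)S divided by the measured train dispersion. The function names and the example train parameters are hypothetical and are not taken from ShaperProbe itself.

```python
from statistics import median


def max_tcp_throughput_bps(window_bytes, rtt_s):
    """Upper bound on single-connection TCP throughput: at most one
    receive window of data can be in flight per round trip."""
    return window_bytes * 8 / rtt_s


def train_capacity_bps(num_packets, packet_bytes, dispersion_s):
    """Capacity estimate from one packet train: the (L - 1) packets after
    the first arrive spread over `dispersion_s` seconds."""
    return (num_packets - 1) * packet_bytes * 8 / dispersion_s


def path_capacity_bps(num_packets, packet_bytes, dispersions_s):
    """Median of the per-train estimates over K trains."""
    estimates = [
        train_capacity_bps(num_packets, packet_bytes, d) for d in dispersions_s
    ]
    return median(estimates)


# Without window scaling, a 65,535-byte window over an 80 ms round trip
# caps a single connection at about 6.55 Mb/s.
tcp_cap = max_tcp_throughput_bps(65535, 0.080)

# Hypothetical trains of 50 packets of 1500 bytes; one train is delayed by
# cross traffic, which the median is meant to tolerate.
capacity = path_capacity_bps(50, 1500, [0.0735, 0.0735, 0.0980])
```

Using the median rather than the mean means a minority of trains disturbed by cross traffic does not pull the capacity estimate down.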
For RTT latency measurements, we compared the BSense agent with the NDT tool and the ping command line utility (invoked with the default parameters). All measurements were carried out several times, each time back-to-back for the compared alternatives, for at least 6 hours at night-time without any other local network activity to keep measurement-related noise low and allow fair comparison.
Figures 7 and 8 show the results for downstream and upstream speeds, respectively, obtained using different tools and across different types of user connections. Measurement results for each tool are shown as a CDF. NDT seems to consistently underestimate speeds, an observation also made in [32], which may be due to the use of a single TCP connection for speed measurement. We can also observe that UDP speeds with the BSense agent are slightly higher than those based on TCP, which is expected because of the inherent TCP overheads (e.g., due to the slow-start mechanism, the recovery time from packet losses and the additional header size). Speed results with BSense UDP are quite close to those of the ShaperProbe method: median UDP upload/download speeds are always within 5% of the value reported by ShaperProbe. The relatively higher speed values with ShaperProbe can be explained by the fact that it more precisely estimates upstream/downstream capacities via packet trains, unlike the application-layer throughput measured by the BSense agent. Results with the satellite connection are somewhat different from the rest, which we believe is due to the dynamics of the satellite link itself and its interaction with TCP behavior; this is further supported by the fact that NDT also fares better over this connection.
The latency measurement comparisons, shown in Fig. 9, suggest that BSense provides very similar results to those obtained using the ping command: in three scenarios the differences are within a few milliseconds. This result is reasonable given that BSense measures latency using a bidirectional UDP flow with packets of similar size (56 bytes) to ICMP ping echo-reply packets. NDT, on the other hand, tends to consistently overestimate the latency, possibly because it uses the average of all TCP round-trip delay samples to estimate the round-trip latency for the connection. The satellite connection is again a bit of an exception, for which the same explanation as above may hold.

Comparison with hardware-based measurement approach
We have also compared the software-based measurement approach used in BSense against the hardware-based approach employed by SamKnows [9]. For this comparison, we deployed a laptop running the BSense agent at a user's home which also had the SamKnows measurement box connected directly to the user's broadband router; the agent on the laptop measured the broadband connection characteristics periodically by communicating with the BSense test server on the Internet. Summaries of the daily measurements collected by the SamKnows box were retrieved from the web-based dashboard accessible to the user. Figure 10 compares daily maximum speeds and minimum latencies measured using both methodologies over a two-week period, shown as CDFs. We find that speed measurements in both cases are fairly similar, with little variation, whereas latency in the SamKnows case is a bit lower, possibly because it takes measurements when the broadband connection is unused and possibly also because its test server is at a different location from that of BSense. Note that we did not have control over the test server location for the SamKnows case.

Broadband coverage analysis for Scotland
In this case study we show the benefit of BSense for understanding broadband coverage, using Scotland as the setting. Such studies would rely upon estimated coverage and speed data from ISPs whenever available. For this study, we mimicked the way ISPs would contribute to the BSense mapping system by trawling through the public websites of different ISPs to determine whether an ISP covers a particular postcode and if so, the estimated download speeds from the ISP's viewpoint, for each of the 152,000 postcodes in Scotland. This information is then fed into the BSense estimated database via the web service API calls. Figure 11 shows the broadband coverage in Scotland for different access technologies based on the estimated data from ISPs collected as described above. For the 3G mobile broadband case, we show data for only one network operator for clarity but the coverage for other mobile network operators is similar. From these maps, we observe that ADSL is the dominant access technology with cable and mobile confined mainly to population centers in the central belt and north east. Now focusing on ADSL alone, we estimate the notspots in Scotland with respect to a threshold download speed. Specifically, a postcode area is considered to be a notspot if ADSL service with estimated download speed above the specified threshold cannot be supported within that area. This may be because residences in the postcode area are too far away from their nearest phone exchanges, for example. We consider three different threshold values (512Kbps, 2Mbps, 8Mbps). Resulting notspot maps produced using BSense are shown in Fig. 12. It can be clearly seen that most postcode areas outside of the central belt of Scotland (with the two main cities of Edinburgh and Glasgow and having the largest population concentration) become notspots as the threshold is increased. 
While it is true that satellite-based broadband covers virtually the whole of Scotland, the large round-trip latencies associated with satellite technology (as shown using measurements in the next case study) make it less attractive.
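The notspot rule used in this case study can be sketched as a simple filter over per-postcode speed estimates. The data layout, function names and postcode labels below are hypothetical placeholders, not the BSense database schema or its web service API. Following the definition above, a postcode counts as a notspot when no estimated ADSL download speed exceeds the threshold.

```python
def best_speed_kbps(isp_estimates_kbps):
    """Best estimated ADSL download speed across ISPs; 0 if no ISP covers the area."""
    return max(isp_estimates_kbps, default=0)


def find_notspots(estimates_by_postcode, threshold_kbps):
    """Postcodes where no ISP estimate is above the threshold."""
    return sorted(
        postcode
        for postcode, estimates in estimates_by_postcode.items()
        if best_speed_kbps(estimates) <= threshold_kbps
    )


# Hypothetical estimated ADSL download speeds (Kbps) per postcode, one entry per ISP.
estimates = {
    "POSTCODE_A": [6000, 4500],
    "POSTCODE_B": [448],
    "POSTCODE_C": [],          # no ADSL coverage reported by any ISP
    "POSTCODE_D": [17000],
    "POSTCODE_E": [1500],
}

# The three thresholds considered in the study: 512Kbps, 2Mbps, 8Mbps.
notspots = {t: find_notspots(estimates, t) for t in (512, 2000, 8000)}
```

Because the rule is a threshold comparison, the set of notspots can only grow as the threshold is raised, which matches the trend described for Fig. 12.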

Broadband quality measurement: a pilot study
In this case study, we assess the broadband quality in a rural part of Scotland. Specifically, we focus on the area around the Isle of Skye located in the northwest of Scotland. We also consider the neighboring archipelago of the 'Small Isles' and the mainland rural areas of the Glenelg and Knoydart peninsulas. This region is quite diverse in terms of demographics, terrain and broadband service provisioning, making it well suited for our broadband performance measurement study. It has a population of around 10,000 people spread across a handful of small towns, several small villages and scores of isolated dwellings in the farming lands. Several different access technologies are used for broadband provisioning. In total, 15 phone exchanges are located in the area. Although every resident has access to a landline, broadband connection types vary. A few phone exchanges are enabled for ADSL2/ADSL2+, which is available only from the telecom incumbent (BT). Other exchanges offer ADSL (8Mbps download speed) service, and a few are enabled only for "Exchange Activate" (512Kbps) ADSL service. There are no FTTH deployments in the area, and cable and 3G coverage are non-existent. Due to a recent broadband reach initiative from the Scottish government, some of the users in rural and remote areas in previously notspot areas in Scotland, including those in our study area, now connect via subsidized yet relatively expensive satellite connections. In addition, residents in a small part of this area connect via Tegola, an experimental/community long-distance WiFi network we deployed six years ago [31].
Fig. 12 BSense generated map of notspots in Scotland that lack an ADSL broadband service supporting download speed greater than the indicated threshold: (a) Threshold=0.5Mbps, (b) Threshold=2Mbps, (c) Threshold=8Mbps. Notspot postcode areas are shaded in red
Through publicity of our pilot broadband quality assessment initiative via email, local press and word of mouth, we managed to find 60 volunteers in the area who were willing to install and run our BSense agent software. Half of these users were connected to the Internet via ADSL lines to different exchanges and with differing line lengths, 18 users were connected via our Tegola network, and the remaining volunteers used satellite connections. Over a 3-month period, we measured the broadband connections of each of the volunteer users, keeping track of median values of download/upload speeds and latency measurements for each user. We collected around 40,000 measurements in total.
To study the broadband quality index across users and access technologies, we used the following parameter settings for the individual utility functions (see Section 3.3), all reasonable given the type of broadband connections in the study area:
- Download speed. Low-end: 2Mbps with a utility of 0.1. High-end: 24Mbps with a utility of 0.9.
- Upload speed. Low-end: 1Mbps with a utility of 0.1. High-end: 5Mbps with a utility of 0.9.
- Latency. Low-end: 200ms with a utility of 0.1. High-end: 20ms with a utility of 0.9.
Figure 13 shows the results. The top graphs show the utility function values for each of the three performance attributes. Each data point in the plots corresponds to a user, with the color of the data point indicating the access technology used. Clearly and as expected, satellite users have poor utility values and are clustered together at the worst extreme. Wireless users on the Tegola network, on the other hand, not only experience high speeds exceeding 20Mbps but are also subject to greater variability in speeds because of the shared nature of access. ADSL users also exhibit greater variability in speeds, like wireless users but for different reasons: differences in the broadband capabilities of the associated phone exchanges and differences in line lengths. Most ADSL users fall in between satellite users and wireless users in this area. The bottom graph in Fig. 13 shows the combined effect of the three attributes. For ease of interpretation, we scale up each user's index value to a percentage value between 0% and 100%. Results for different access technologies, shown as CDFs, follow directly from the top graphs given our choice for the multi-attribute utility function (an equal weighted sum of individual attribute utilities).
When index values for different users are geographically rendered on a map, however, the result is quite revealing (see Fig. 14). Here coloring is done at the ward level: all users belonging to a ward (an area with about 50 residents and 20 households) are aggregated together. We observe that remote parts of Knoydart and the Small Isles (colored in red) fare poorly, whereas the adjacent ward above Knoydart has the best index as a result of high-speed wireless connections from the Tegola network. Wards on the Isle of Skye have intermediate index values as they mainly consist of ADSL users.
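The ward-level rendering step can be sketched as a group-and-aggregate over per-user index values, scaled to percentages. The text does not state how users within a ward are combined, so taking the mean is an assumption made here for illustration. The ward labels, index values and function name are likewise hypothetical.

```python
from collections import defaultdict
from statistics import mean


def ward_index_percent(user_records):
    """Aggregate per-user index values (in [0, 1]) into one percentage per ward.

    user_records: iterable of (ward, index) pairs.
    Aggregation by mean is an assumption; the source only says users in a
    ward are "aggregated together".
    """
    by_ward = defaultdict(list)
    for ward, index in user_records:
        by_ward[ward].append(index)
    return {ward: 100.0 * mean(values) for ward, values in by_ward.items()}


# Hypothetical per-user quality index values.
records = [
    ("ward_satellite", 0.05),
    ("ward_satellite", 0.07),
    ("ward_adsl", 0.40),
    ("ward_adsl", 0.50),
    ("ward_wireless", 0.90),
]
ward_scores = ward_index_percent(records)
```

With only a handful of volunteers per ward, a median or a minimum might be a more robust choice than the mean; the sketch makes that a one-line change.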

Discussion
Results from the previous evaluation section on profiling and performance benchmarking validate the current implementation of the BSense measurement component. The two case studies demonstrate the utility of the BSense framework for real-world broadband coverage and quality assessments; the pilot study also shows the value of the broadband quality index as a promising approach to summarize the collective impact of multiple underlying network metrics into a form that users can rely on.
More crucially, the unique contribution of BSense in our view is a flexible broadband mapping software architecture that combines broadband data with demographics and geographic data. This in turn enables analysis of coverage / notspots as well as broadband quality via continual measurement from end-user devices. Overall, BSense can be seen as bringing together the capabilities of tools and systems that either focus solely on broadband coverage/availability mapping (e.g., [1][2][3]) or those that focus exclusively on measurements (e.g., [10,18,21]). Moreover, the measurement component in BSense is designed with flexibility in mind to allow for easier transition to other measurement schemes (e.g., [14]). Also as noted earlier, the current measurement approach used in BSense is a result of experience and lessons concerning the limitations of the web based measurement approach used in the initial version [16].

Conclusions
In this paper, we have developed a flexible framework for broadband mapping called BSense that incorporates both model-based broadband coverage data and broadband performance measurement data from users. The BSense framework also incorporates a flexible specification of a broadband quality index based on utility functions and multi-attribute utility theory. We have implemented BSense using open-source tools and used it to demonstrate the value of the BSense approach for broadband coverage and quality assessment with two real-world case studies. Our future work will focus on enhanced measurement techniques that are robust to various sources of variability. We would also like to enhance the broadband quality index by considering additional attributes, the variability of each attribute and relationships among various attributes. Finally, we would like to extend BSense for mobile broadband mapping.