From 6f0ec08c47150dca5986ba24292abcd4eafce685 Mon Sep 17 00:00:00 2001
From: Simone Margaritelli <evilsocket@gmail.com>
Date: Thu, 3 Oct 2019 22:31:27 +0200
Subject: [PATCH] documented the reward function (closes #50)

---
 docs/usage.md | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 98 insertions(+), 1 deletion(-)

diff --git a/docs/usage.md b/docs/usage.md
index 7d6c5d8..e9777bc 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -1,4 +1,101 @@
### Usage

At its core, Pwnagotchi is a very simple creature: its main algorithm can be summarized as:

```python
# main loop
while True:
    # ask bettercap for all visible access points and their clients
    aps = get_all_visible_access_points()
    # loop over each AP
    for ap in aps:
        # send an association frame in order to grab the PMKID
        send_assoc(ap)
        # loop over each client station of the AP
        for client in ap.clients:
            # deauthenticate the client to get its half or full handshake
            deauthenticate(client)

    wait_for_loot()
```

Despite its simplicity, this logic is controlled by several parameters that regulate the wait times, the timeouts, which channels to hop on, and so on.

From `config.yml`:

```yaml
personality:
  # advertise our presence
  advertise: true
  # perform a deauthentication attack against client stations in order to get full or half handshakes
  deauth: true
  # send association frames to APs in order to get the PMKID
  associate: true
  # list of channels to recon on, or empty for all channels
  channels: []
  # minimum WiFi signal strength in dBm
  min_rssi: -200
  # number of seconds for wifi.ap.ttl
  ap_ttl: 120
  # number of seconds for wifi.sta.ttl
  sta_ttl: 300
  # time in seconds to wait during channel recon
  recon_time: 30
  # number of inactive epochs after which recon_time gets multiplied by recon_inactive_multiplier
  max_inactive_scale: 2
  # if more than max_inactive_scale epochs are inactive, recon_time *= recon_inactive_multiplier
  recon_inactive_multiplier: 2
  # time in seconds to wait during channel hopping if activity has been performed
  hop_recon_time: 10
  # time in seconds to wait during channel hopping if no activity has been performed
  min_recon_time: 5
  # maximum number of deauths/associations per BSSID per session
  max_interactions: 3
  # maximum number of misses before considering the data stale and triggering a new recon
  max_misses_for_recon: 5
  # number of active epochs that triggers the excited state
  excited_num_epochs: 10
  # number of inactive epochs that triggers the bored state
  bored_num_epochs: 15
  # number of inactive epochs that triggers the sad state
  sad_num_epochs: 25
```

There is no single set of parameters that is optimal for every situation: when the unit is moving (during a walk, for instance), smaller timeouts and a stricter RSSI threshold might be preferable in order to quickly discard routers that are no longer in range, while when it is stationary in a high density area (like an office), other values might work better. The role of the AI is to observe what is happening at the WiFi level and adjust those parameters in order to maximize the cumulative reward of each loop iteration, or `epoch`.
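As an illustration of how some of these parameters interact, here is a minimal sketch of the backoff logic that the comments above describe. This is not the actual implementation: the helper names (`next_recon_time`, `next_hop_time`) are made up for this example, and only the parameter semantics come from the configuration comments.

```python
# illustrative sketch only: how recon and hop times could scale with
# activity, based on the parameter descriptions above (not actual source)
personality = {
    'recon_time': 30,
    'max_inactive_scale': 2,
    'recon_inactive_multiplier': 2,
    'hop_recon_time': 10,
    'min_recon_time': 5,
}

def next_recon_time(personality, inactive_epochs):
    """How long to scan for APs, scaled up when nothing is happening."""
    recon_time = personality['recon_time']
    # if more than max_inactive_scale epochs were inactive,
    # scan longer before interacting again
    if inactive_epochs > personality['max_inactive_scale']:
        recon_time *= personality['recon_inactive_multiplier']
    return recon_time

def next_hop_time(personality, had_activity):
    """How long to dwell on a channel while hopping."""
    # linger on channels that produced activity, leave quiet ones quickly
    return personality['hop_recon_time'] if had_activity else personality['min_recon_time']

print(next_recon_time(personality, inactive_epochs=3))  # 60
print(next_hop_time(personality, had_activity=False))   # 5
```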
#### Reward Function

After each iteration of the main loop (an `epoch`), a reward is computed: a score that represents how well the parameters performed. The following is an excerpt from `pwnagotchi/ai/reward.py`:

```python
# state contains the information of the last epoch
# epoch_n is the number of the last epoch
tot_epochs = epoch_n + 1e-20  # 1e-20 is added to avoid a division by 0
tot_interactions = max(state['num_deauths'] + state['num_associations'], state['num_handshakes']) + 1e-20
tot_channels = wifi.NumChannels

# ideally, for each interaction we would get a handshake
h = state['num_handshakes'] / tot_interactions
# small positive reward the more active epochs we have
a = .2 * (state['active_for_epochs'] / tot_epochs)
# make sure we keep hopping on the widest channel spectrum
c = .1 * (state['num_hops'] / tot_channels)
# small negative reward if we don't see APs for a while
b = -.3 * (state['blind_for_epochs'] / tot_epochs)
# small negative reward if we interact with things that are not in range anymore
m = -.3 * (state['missed_interactions'] / tot_interactions)
# small negative reward for inactive epochs
i = -.2 * (state['inactive_for_epochs'] / tot_epochs)

reward = h + a + c + b + i + m
```

By maximizing this reward value, the AI learns over time which set of parameters performs best under the current environmental conditions (a worked example is included at the end of this page).

### User Interface

The UI is available either on the display, if one is installed, or via http://pwnagotchi.local:8080/ if you connect to the unit via `usb0` and set a static address on the network interface (replace `pwnagotchi` with the hostname of your unit).
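Finally, here is the worked example of the reward function promised above: a small, self-contained sketch that plugs an invented epoch `state` into the formula from `pwnagotchi/ai/reward.py`. All the numbers are made up for illustration, and `NUM_CHANNELS` is a stand-in for `wifi.NumChannels`.

```python
# worked example of the reward formula above; all values are invented
NUM_CHANNELS = 14  # stand-in for wifi.NumChannels (2.4 GHz band)

state = {
    'num_deauths': 5,          # deauth frames sent this epoch
    'num_associations': 5,     # association frames sent this epoch
    'num_handshakes': 4,       # handshakes actually captured
    'active_for_epochs': 8,    # epochs with activity
    'inactive_for_epochs': 2,  # epochs without activity
    'blind_for_epochs': 0,     # epochs without visible APs
    'missed_interactions': 1,  # interactions with out-of-range targets
    'num_hops': 10,            # channels visited
}
epoch_n = 10

tot_epochs = epoch_n + 1e-20
tot_interactions = max(state['num_deauths'] + state['num_associations'],
                       state['num_handshakes']) + 1e-20

h = state['num_handshakes'] / tot_interactions               # 0.4
a = .2 * (state['active_for_epochs'] / tot_epochs)           # 0.16
c = .1 * (state['num_hops'] / NUM_CHANNELS)                  # ~0.071
b = -.3 * (state['blind_for_epochs'] / tot_epochs)           # 0.0
m = -.3 * (state['missed_interactions'] / tot_interactions)  # -0.03
i = -.2 * (state['inactive_for_epochs'] / tot_epochs)        # -0.04

print(round(h + a + c + b + i + m, 3))  # 0.561
```

An epoch where every interaction yields a handshake pushes the reward toward 1, while blind, missed, or inactive epochs pull it down; tuning the personality parameters to keep this number high is exactly what the AI is rewarded for.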