### Usage
At its core, Pwnagotchi is a very simple creature: its main algorithm can be summarized as:
```python
# main loop
while True:
    # ask bettercap for all visible access points and their clients
    aps = get_all_visible_access_points()
    # loop over each AP
    for ap in aps:
        # send an association frame in order to grab the PMKID
        send_assoc(ap)
        # loop over each client station of the AP
        for client in ap.clients:
            # deauthenticate the client to get its half or full handshake
            deauthenticate(client)

    wait_for_loot()
```
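The functions in this loop are pseudo-code standing in for bettercap-backed operations. One relevant detail is that interactions with the same target are capped per session by the `max_interactions` parameter documented in the configuration; a minimal sketch of that bookkeeping (hypothetical names, not the actual implementation) could look like:

```python
from collections import defaultdict

# per-session counter of interactions per BSSID
# (hypothetical sketch, not the actual pwnagotchi code)
interactions = defaultdict(int)

def can_interact(bssid, max_interactions=3):
    # skip APs/stations we have already attacked max_interactions times
    return interactions[bssid] < max_interactions

def record_interaction(bssid):
    interactions[bssid] += 1
```

Once a BSSID has been deauthed/associated `max_interactions` times, the loop would simply skip it for the rest of the session.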
Despite its simplicity, this logic is controlled by several parameters that regulate wait times, timeouts, which channels to hop on, and so on.

From `config.yml`:

```yaml
personality:
  # advertise our presence
  advertise: true
  # perform a deauthentication attack against client stations in order to get full or half handshakes
  deauth: true
  # send association frames to APs in order to get the PMKID
  associate: true
  # list of channels to recon on, or empty for all channels
  channels: []
  # minimum WiFi signal strength in dBm
  min_rssi: -200
  # number of seconds for wifi.ap.ttl
  ap_ttl: 120
  # number of seconds for wifi.sta.ttl
  sta_ttl: 300
  # time in seconds to wait during channel recon
  recon_time: 30
  # number of inactive epochs after which recon_time gets multiplied by recon_inactive_multiplier
  max_inactive_scale: 2
  # if more than max_inactive_scale epochs are inactive, recon_time *= recon_inactive_multiplier
  recon_inactive_multiplier: 2
  # time in seconds to wait during channel hopping if activity has been performed
  hop_recon_time: 10
  # time in seconds to wait during channel hopping if no activity has been performed
  min_recon_time: 5
  # maximum number of deauths/associations per BSSID per session
  max_interactions: 3
  # maximum number of misses before considering the data stale and triggering a new recon
  max_misses_for_recon: 5
  # number of active epochs that triggers the excited state
  excited_num_epochs: 10
  # number of inactive epochs that triggers the bored state
  bored_num_epochs: 15
  # number of inactive epochs that triggers the sad state
  sad_num_epochs: 25
```
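As an illustration of how two of these knobs interact: per the comments above, when the unit has been inactive for more than `max_inactive_scale` epochs, `recon_time` gets multiplied by `recon_inactive_multiplier`. A minimal sketch of that rule (illustrative only, not the project's actual code):

```python
def effective_recon_time(recon_time, inactive_epochs,
                         max_inactive_scale=2, recon_inactive_multiplier=2):
    # slow down channel recon when nothing has happened for a while
    if inactive_epochs > max_inactive_scale:
        return recon_time * recon_inactive_multiplier
    return recon_time
```

With the defaults above, three or more inactive epochs double the recon wait from 30 to 60 seconds.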
There is no optimal set of parameters for every situation: when the unit is moving (during a walk, for instance), smaller timeouts and RSSI thresholds might be preferable in order to quickly discard routers that are no longer in range, while when stationary in a high-density area (like an office), other parameters might work better. The role of the AI is to observe what's going on at the WiFi level and to adjust those parameters in order to maximize the cumulative reward of each iteration of the loop (an epoch).

#### Reward Function
After each iteration of the main loop (an `epoch`), a reward, i.e. a score representing how well the parameters performed, is computed as follows (an excerpt from `pwnagotchi/ai/reward.py`):

```python
# state contains the information of the last epoch
# epoch_n is the number of the last epoch
tot_epochs = epoch_n + 1e-20  # 1e-20 is added to avoid a division by zero
tot_interactions = max(state['num_deauths'] + state['num_associations'], state['num_handshakes']) + 1e-20
tot_channels = wifi.NumChannels

# ideally, each interaction would yield a handshake
h = state['num_handshakes'] / tot_interactions
# small positive reward, the more active epochs we have
a = .2 * (state['active_for_epochs'] / tot_epochs)
# make sure we keep hopping on the widest channel spectrum
c = .1 * (state['num_hops'] / tot_channels)
# small negative reward if we don't see APs for a while
b = -.3 * (state['blind_for_epochs'] / tot_epochs)
# small negative reward if we interact with things that are no longer in range
m = -.3 * (state['missed_interactions'] / tot_interactions)
# small negative reward for inactive epochs
i = -.2 * (state['inactive_for_epochs'] / tot_epochs)

reward = h + a + c + b + i + m
```
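To make the scale of these terms concrete, here is the same formula evaluated on a made-up epoch state (the numbers, and the channel count standing in for `wifi.NumChannels`, are arbitrary assumptions, purely for illustration):

```python
state = {
    'num_deauths': 3, 'num_associations': 2, 'num_handshakes': 4,
    'active_for_epochs': 8, 'inactive_for_epochs': 2, 'blind_for_epochs': 1,
    'num_hops': 5, 'missed_interactions': 1,
}
epoch_n = 10
tot_channels = 10  # assumed channel count, for illustration

tot_epochs = epoch_n + 1e-20
tot_interactions = max(state['num_deauths'] + state['num_associations'],
                       state['num_handshakes']) + 1e-20

h = state['num_handshakes'] / tot_interactions               # 4/5  =  0.80
a = .2 * (state['active_for_epochs'] / tot_epochs)           #         0.16
c = .1 * (state['num_hops'] / tot_channels)                  #         0.05
b = -.3 * (state['blind_for_epochs'] / tot_epochs)           #        -0.03
m = -.3 * (state['missed_interactions'] / tot_interactions)  #        -0.06
i = -.2 * (state['inactive_for_epochs'] / tot_epochs)        #        -0.04

reward = h + a + c + b + i + m  # ≈ 0.88
```

Note how the handshake ratio `h` dominates: the other terms only nudge the score toward staying active, hopping widely, and not wasting interactions.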
By maximizing this reward value, the AI learns over time which set of parameters performs best under the current environmental conditions.
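As a toy illustration of the idea of "adjust a parameter to maximize a reward signal" (this is deliberately simplified and is not how pwnagotchi's AI works internally), a hill-climbing loop over a single parameter might look like:

```python
import random

def tune(reward_fn, value, step=1, iterations=200, seed=0):
    # try small random perturbations of the parameter and keep
    # only the ones that improve the observed reward
    rng = random.Random(seed)
    best, best_reward = value, reward_fn(value)
    for _ in range(iterations):
        candidate = best + rng.choice((-step, step))
        r = reward_fn(candidate)
        if r > best_reward:
            best, best_reward = candidate, r
    return best
```

For example, if the (unknown) ideal recon time for the current environment were 30 seconds, `tune(lambda v: -(v - 30) ** 2, value=10)` would climb toward it; the real AI does something far more sophisticated over many parameters at once.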
### User Interface
The UI is available either on the display, if one is installed, or at http://pwnagotchi.local:8080/ if you connect to the unit via `usb0` and set a static address on the network interface (replace `pwnagotchi` with the hostname of your unit).
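For the `usb0` route, the static address is assigned on the host side. The exact addresses depend on your setup; the ones below are an assumption (a common convention is host `10.0.0.1`, unit `10.0.0.2`), so adjust them to your own configuration:

```shell
# on the host (Linux), after plugging the unit in over USB:
# assign a static address to the USB ethernet gadget interface
sudo ip addr add 10.0.0.1/24 dev usb0
sudo ip link set usb0 up

# the unit should now be reachable (address is an assumption, see above)
ping -c 1 10.0.0.2
```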