There are currently three ways to run the Inference Node; each is slightly different and better suited to particular uses. Choose the installation that fits you best by referencing the table below.
|                  | Fortytwo App    | Fortytwo Container | Fortytwo CLI    |
| ---------------- | --------------- | ------------------ | --------------- |
| Runner tier      | Beginner        | Docker user        | Console user    |
| Best for         | Personal device | Server/VM          | Personal device |
| Interaction type | GUI             | Console            | Guided console  |
| OS               |                 |                    |                 |
| Nvidia GPU       | Supported       | Required           | Supported       |
| Apple Silicon    | Supported      |                    | Supported       |

Features compared (each is described under "Features Explained"):
  • Manual Mode
  • Auto Mode
  • Editable KV Cache
  • Multi-GPU
  • Split GPUs
  • Load Custom GGUF

Features Explained

Manual Mode
You can maximize your node’s potential, but it requires knowledge and commitment:
  • You choose which model your node runs.
  • Performance depends on your choices.
  • This mode is intended for noderunners who are familiar with language models.
Auto Mode
Your node does all the work:
  • Models are selected automatically.
  • Performance is balanced.
  • You don’t need to know anything about language models.
Use the Fortytwo App for more complex, real-time model auto-management on your system. The Fortytwo CLI can recommend options based on your resources, but its recommendations will not be as optimal as the Fortytwo App's.
Editable KV Cache
By default, our applications use an adaptive KV Cache size, so the node can adapt to your hardware. When you launch the node, it analyzes your available resources and reserves the following:
  • GPU-based systems (primarily Windows, Linux) — reserves 90% of idle VRAM.
  • ARM-based systems with unified memory (primarily macOS) — reserves 80 to 85% of leftover RAM.
If KV Cache is editable, you can control the amount of resources taken by caching. Otherwise, it falls back to the default behavior. Read more here: ‘Performance Balancing’.
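The reservation rule above is a simple percentage of free memory. A minimal sketch follows; the function name and system labels are hypothetical, and only the percentages come from the defaults listed above.

```python
def kv_cache_reservation(free_memory_gb: float, system: str) -> float:
    """Illustrative adaptive KV Cache sizing (hypothetical helper).

    free_memory_gb -- idle VRAM (GPU systems) or leftover unified RAM (ARM).
    system         -- "gpu" or "unified" (labels chosen for this sketch).
    """
    if system == "gpu":          # e.g. Windows/Linux with a discrete GPU
        return free_memory_gb * 0.90
    elif system == "unified":    # e.g. macOS on Apple Silicon (80-85%; 80% shown)
        return free_memory_gb * 0.80
    raise ValueError(f"unknown system type: {system!r}")

# A machine with 24 GB of idle VRAM would reserve about 21.6 GB for the KV Cache.
print(kv_cache_reservation(24.0, "gpu"))
```

The real node measures idle memory itself at launch; here the free amount is passed in directly to keep the sketch self-contained.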
Multi-GPU
On systems with several GPUs installed, or when several GPUs are allocated to a single process, the node utilizes all of the available resources across these GPUs. For example: your system is equipped with 2 GPUs, each with 24 GB of VRAM. In this case, your node reads it as a total of 48 GB of VRAM and can run bigger models than a single GPU would allow.
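The pooling in the example above amounts to summing idle VRAM across devices. A minimal sketch (variable names are hypothetical):

```python
# Hypothetical per-device idle VRAM, in GB (two 24 GB GPUs, as in the example).
gpu_vram_gb = [24, 24]

# In Multi-GPU mode the node treats the pool as one memory budget.
total_vram_gb = sum(gpu_vram_gb)
print(total_vram_gb)  # 48 -- enough for models a single 24 GB GPU could not load
```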
Split GPUs
Allows you to assign a particular GPU, or several GPUs, from the available array to a single node. For example: if 8 GPUs are available, it is possible to run up to 8 nodes on this device.
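Conceptually, splitting means pinning each node to its own GPU index, so 8 GPUs can host up to 8 independent nodes. A toy sketch (names are hypothetical; Fortytwo's actual assignment mechanism is not described in this document):

```python
available_gpus = list(range(8))  # GPU indices 0..7 reported by the system

# One node per GPU: map each node id to the single GPU it may use.
node_assignments = {f"node-{i}": gpu for i, gpu in enumerate(available_gpus)}

print(len(node_assignments))  # 8 nodes, each pinned to a distinct GPU
```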
Load Custom GGUF
Allows you to select an externally downloaded model in GGUF format. Without this feature, the node only loads models from the Hugging Face repository, such as Strand-Rust-Coder 14B on Hugging Face. Note that not all GGUF models are immediately supported.