This project aims to develop a reinforcement learning algorithm that locates shipwrecked people using a swarm of drones. A simulated environment was developed to train the algorithm and to visualize its behavior under the ocean's dynamic conditions. Image recognition of shipwrecked people is out of scope: the focus is on optimizing each drone's search routine so that the target is found as efficiently as possible. The implemented REINFORCE algorithm takes into account a dynamic probability map, which represents the chance of finding a person at each location, as well as the positions of the other agents. The outcomes include an open-source Python package for the environment and an implementation of the reinforcement learning algorithm. The trained algorithm outperforms the predefined search approach, indicating the advantages of reinforcement learning in both efficiency and effectiveness.
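As a rough illustration of the kind of policy-gradient update that REINFORCE performs, the sketch below applies one update to a linear softmax policy from a single recorded episode. It is not the project's implementation: the function names, the feature vector (imagined here as probability-map values around a drone plus information about other agents), the learning rate, and the discount factor are all assumptions made for the example.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(theta, episode, lr=0.1, gamma=0.99):
    """Return updated weights after one REINFORCE step.

    theta:   one weight vector per action (hypothetical linear policy).
    episode: list of (features, action, reward) tuples, where features is a
             hypothetical observation vector, e.g. nearby probability-map
             values and other agents' relative positions.
    """
    returns = discounted_returns([r for _, _, r in episode], gamma)
    new_theta = [row[:] for row in theta]
    for (features, action, _), g in zip(episode, returns):
        # Action probabilities under the policy that generated the episode.
        logits = [sum(w * x for w, x in zip(row, features)) for row in theta]
        probs = softmax(logits)
        for a, row in enumerate(new_theta):
            # Gradient of log pi(action | s) with respect to theta[a]:
            # (1[a == action] - pi(a | s)) * features
            coeff = (1.0 if a == action else 0.0) - probs[a]
            for i, x in enumerate(features):
                row[i] += lr * g * coeff * x
    return new_theta
```

Actions followed by a high return have their probability increased, and the others are decreased in proportion. In a multi-drone setting, one plausible design is to share a single set of weights across all drones and to pool their episodes, but that choice is likewise an assumption here.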