I'm using the ProcessPoolExecutor context manager to run several Kafka consumers in parallel. I need to store the process IDs of the child processes so that I can cleanly terminate them later. My code looks like this:

    from concurrent.futures import ProcessPoolExecutor

    class MultiProcessConsumer:
        ...

        def run_in_parallel(self):
            parallelism_factor = 5
            with ProcessPoolExecutor() as executor:
                processes = [executor.submit(self.consume) for _ in range(parallelism_factor)]
                # It would be nice if I could write [process.pid for process in processes] to a file here.

        def consume(self):
            while True:
                for message in self.kafka_consumer:
                    do_stuff(message)

I know I can call os.getpid() in the consume method to get the PIDs. But handling them properly (when consumers are constantly shutting down and starting up) requires some extra work.
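For reference, here is a minimal sketch of the os.getpid() approach mentioned above, not a definitive solution. The names record_pid, read_pids and PID_FILE are hypothetical, and consume here is a stand-in that returns right away instead of running the Kafka loop. It does not handle the cleanup of stale PIDs when consumers restart, which is the extra work in question.

```python
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor

# Hypothetical location for the PID file; any writable path would do.
PID_FILE = os.path.join(tempfile.gettempdir(), "consumer_pids.txt")


def record_pid(path=PID_FILE):
    """Append the calling process's PID to `path` and return it."""
    pid = os.getpid()
    with open(path, "a") as f:
        f.write(f"{pid}\n")
    return pid


def consume(path=PID_FILE):
    # Stand-in for the real consume(): record the PID first;
    # the endless Kafka loop would follow here.
    return record_pid(path)


def read_pids(path=PID_FILE):
    """Return every PID recorded in `path`, in the order written."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


if __name__ == "__main__":
    parallelism_factor = 5
    with ProcessPoolExecutor(max_workers=parallelism_factor) as executor:
        futures = [executor.submit(consume) for _ in range(parallelism_factor)]
        # Works only because the stand-in returns; a real consume() never would.
        pids = {f.result() for f in futures}
    print(sorted(pids))
```

Because the stand-in tasks finish quickly, one worker may pick up several of them, so the set can contain fewer than five PIDs. With a real consume that loops forever, the futures never complete, and the parent would have to read the PIDs back from the file with read_pids instead.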
How would you propose that I get and store PIDs of the child processes in such a context?