Thursday, June 23, 2016

Multiprocessing: why is a numpy array shared with the child processes, while a list is copied?


I used this script (see the code at the end) to assess whether a global object is shared or copied when the parent process is forked.

Briefly, the script creates a global data object, and the child processes iterate over data. The script also monitors memory usage to assess whether the object was copied in the child processes.

Here are the results:

1) data = np.ones((N,N)). Operation in the child process: data.sum(). Result: data is shared (no copy).
2) data = list(range(pow(10, 8))). Operation in the child process: sum(data). Result: data is copied.
3) data = list(range(pow(10, 8))). Operation in the child process: for x in data: pass. Result: data is copied.

Result 1) is expected because of copy-on-write. I am a bit puzzled by results 2) and 3). Why is data copied?

Script source

    import multiprocessing as mp
    import numpy as np
    import logging
    import os

    logger = mp.log_to_stderr(logging.WARNING)

    def free_memory():
        # Sum the MemFree, Buffers and Cached fields of /proc/meminfo (in kB).
        total = 0
        with open('/proc/meminfo', 'r') as f:
            for line in f:
                line = line.strip()
                if any(line.startswith(field) for field in ('MemFree', 'Buffers', 'Cached')):
                    field, amount, unit = line.split()
                    amount = int(amount)
                    if unit != 'kB':
                        raise ValueError(
                            'Unknown unit {u!r} in /proc/meminfo'.format(u = unit))
                    total += amount
        return total

    def worker(i):
        x = data.sum()    # Exercise access to data
        logger.warning('Free memory: {m}'.format(m = free_memory()))

    def main():
        procs = [mp.Process(target = worker, args = (i, )) for i in range(4)]
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()

    logger.warning('Initial free: {m}'.format(m = free_memory()))
    N = 15000
    data = np.ones((N,N))
    logger.warning('After allocating data: {m}'.format(m = free_memory()))

    if __name__ == '__main__':
        main()

Detailed results

Run 1 output

    [WARNING/MainProcess] Initial free: 25.1 GB
    [WARNING/MainProcess] After allocating data: 23.3 GB
    [WARNING/Process-2] Free memory: 23.3 GB
    [WARNING/Process-4] Free memory: 23.3 GB
    [WARNING/Process-1] Free memory: 23.3 GB
    [WARNING/Process-3] Free memory: 23.3 GB

Run 2 output

    [WARNING/MainProcess] Initial free: 25.1 GB
    [WARNING/MainProcess] After allocating data: 21.9 GB
    [WARNING/Process-2] Free memory: 12.6 GB
    [WARNING/Process-4] Free memory: 12.7 GB
    [WARNING/Process-1] Free memory: 16.3 GB
    [WARNING/Process-3] Free memory: 17.1 GB

Run 3 output

    [WARNING/MainProcess] Initial free: 25.1 GB
    [WARNING/MainProcess] After allocating data: 21.9 GB
    [WARNING/Process-2] Free memory: 12.6 GB
    [WARNING/Process-4] Free memory: 13.1 GB
    [WARNING/Process-1] Free memory: 14.6 GB
    [WARNING/Process-3] Free memory: 19.3 GB
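
The post itself does not answer the question. One commonly cited explanation is that CPython keeps each object's reference count inside the object, so merely iterating over a list of Python objects writes to the memory pages that hold them. After a fork, those writes would trigger copy-on-write for the list elements. A NumPy array keeps its values in one raw buffer, and data.sum() only reads that buffer. The sketch below is an illustration under that assumption, not part of the original script. The helper name refcount_delta_when_bound is hypothetical, and plain object() instances stand in for the list elements.

```python
import sys


def refcount_delta_when_bound(items):
    """Return how much an element's refcount grows while a name is bound to it.

    Binding a name to an element is what `for x in items` does on each step.
    (Hypothetical helper, written for this illustration only.)
    """
    before = sys.getrefcount(items[0])
    x = items[0]                       # same effect as one loop iteration
    after = sys.getrefcount(items[0])  # measured the same way as `before`
    del x                              # dropping the name writes the count again
    return after - before


# Plain objects stand in for the elements of the list in the original script.
elements = [object() for _ in range(3)]
delta = refcount_delta_when_bound(elements)
print("refcount change while an element is bound:", delta)
```

If the count changes while the element is only being read, the read is also a write to the object's header. That would be consistent with the memory growth seen in runs 2 and 3.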
