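Recently, while running a program with Keras, I wanted it to run my tests automatically in a loop, so I wrapped everything in a for loop. After only a few iterations it failed with an out-of-memory error.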
2019-04-24 13:56:03.386390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 162 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/3dlstm/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1062: calling reduce_prod (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2019-04-24 13:56:14.278430: W tensorflow/core/common_runtime/bfc_allocator.cc:267] Allocator (GPU_0_bfc) ran out of memory trying to allocate 128.00MiB. Current allocation summary follows.
2019-04-24 13:56:14.278531: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (256): Total Chunks: 20, Chunks in use: 20. 5.0KiB allocated for chunks. 5.0KiB in use in bin. 584B client-requested in use in bin.
2019-04-24 13:56:14.278560: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (512): Total Chunks: 2, Chunks in use: 2. 1.0KiB allocated for chunks. 1.0KiB in use in bin. 1.0KiB client-requested in use in bin.
2019-04-24 13:56:14.278581: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (1024): Total Chunks: 4, Chunks in use: 4. 4.2KiB allocated for chunks. 4.2KiB in use in bin. 4.0KiB client-requested in use in bin.
2019-04-24 13:56:14.278602: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (2048): Total Chunks: 6, Chunks in use: 6. 12.0KiB allocated for chunks. 12.0KiB in use in bin. 11.9KiB client-requested in use in bin.
2019-04-24 13:56:14.278621: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278640: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278661: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (16384): Total Chunks: 2, Chunks in use: 2. 36.2KiB allocated for chunks. 36.2KiB in use in bin. 36.2KiB client-requested in use in bin.
2019-04-24 13:56:14.278679: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (32768): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278697: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278715: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (131072): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278733: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278754: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (524288): Total Chunks: 1, Chunks in use: 1. 975.5KiB allocated for chunks. 975.5KiB in use in bin. 864.0KiB client-requested in use in bin.
2019-04-24 13:56:14.278773: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278792: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (2097152): Total Chunks: 1, Chunks in use: 0. 3.99MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278812: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (4194304): Total Chunks: 1, Chunks in use: 1. 4.00MiB allocated for chunks. 4.00MiB in use in bin. 3.38MiB client-requested in use in bin.
2019-04-24 13:56:14.278832: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (8388608): Total Chunks: 1, Chunks in use: 1. 8.00MiB allocated for chunks. 8.00MiB in use in bin. 6.75MiB client-requested in use in bin.
2019-04-24 13:56:14.278853: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (16777216): Total Chunks: 2, Chunks in use: 2. 43.00MiB allocated for chunks. 43.00MiB in use in bin. 40.50MiB client-requested in use in bin.
2019-04-24 13:56:14.278874: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (33554432): Total Chunks: 2, Chunks in use: 2. 69.00MiB allocated for chunks. 69.00MiB in use in bin. 54.00MiB client-requested in use in bin.
2019-04-24 13:56:14.278895: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (67108864): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278913: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278931: I tensorflow/core/common_runtime/bfc_allocator.cc:597] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-04-24 13:56:14.278950: I tensorflow/core/common_runtime/bfc_allocator.cc:613] Bin for 128.00MiB was 128.00MiB, Chunk State:
2019-04-24 13:56:14.278970: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6834000000 of size 28311552
2019-04-24 13:56:14.278986: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6835b00000 of size 38797312
2019-04-24 13:56:14.279008: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f683de00000 of size 33554432
2019-04-24 13:56:14.279030: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6850c00000 of size 16777216
2019-04-24 13:56:14.279046: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6851c00000 of size 8388608
2019-04-24 13:56:14.279063: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6880200000 of size 1024
2019-04-24 13:56:14.279078: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6880200400 of size 1024
2019-04-24 13:56:14.279095: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6880200800 of size 2048
2019-04-24 13:56:14.279110: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6880201000 of size 2048
2019-04-24 13:56:14.279126: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6880201800 of size 2048
2019-04-24 13:56:14.279142: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6880202000 of size 2048
2019-04-24 13:56:14.279158: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Free at 0x7f6880202800 of size 4184064
2019-04-24 13:56:14.279174: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f6890400000 of size 4194304
2019-04-24 13:56:14.279190: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400000 of size 256
2019-04-24 13:56:14.279206: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400100 of size 256
2019-04-24 13:56:14.279222: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400200 of size 256
2019-04-24 13:56:14.279238: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400300 of size 512
2019-04-24 13:56:14.279253: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400500 of size 256
2019-04-24 13:56:14.279269: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400600 of size 256
2019-04-24 13:56:14.279284: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400700 of size 256
2019-04-24 13:56:14.279300: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400800 of size 256
2019-04-24 13:56:14.279315: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400900 of size 1024
2019-04-24 13:56:14.279331: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400d00 of size 256
2019-04-24 13:56:14.279346: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400e00 of size 256
2019-04-24 13:56:14.279361: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4400f00 of size 256
2019-04-24 13:56:14.279377: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401000 of size 256
2019-04-24 13:56:14.279394: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401100 of size 2048
2019-04-24 13:56:14.279410: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401900 of size 256
2019-04-24 13:56:14.279425: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401a00 of size 256
2019-04-24 13:56:14.279440: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401b00 of size 256
2019-04-24 13:56:14.279456: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401c00 of size 256
2019-04-24 13:56:14.279472: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4401d00 of size 16384
2019-04-24 13:56:14.279487: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4405d00 of size 256
2019-04-24 13:56:14.279503: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4405e00 of size 256
2019-04-24 13:56:14.279518: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4405f00 of size 2048
2019-04-24 13:56:14.279534: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4406700 of size 256
2019-04-24 13:56:14.279550: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4406800 of size 256
2019-04-24 13:56:14.279566: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4406900 of size 1280
2019-04-24 13:56:14.279582: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4406e00 of size 256
2019-04-24 13:56:14.279598: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a4406f00 of size 20736
2019-04-24 13:56:14.279614: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a440c000 of size 512
2019-04-24 13:56:14.279630: I tensorflow/core/common_runtime/bfc_allocator.cc:632] Chunk at 0x7f68a440c200 of size 998912
2019-04-24 13:56:14.279646: I tensorflow/core/common_runtime/bfc_allocator.cc:638] Summary of in-use Chunks by size:
2019-04-24 13:56:14.279666: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 20 Chunks of size 256 totalling 5.0KiB
2019-04-24 13:56:14.279685: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 2 Chunks of size 512 totalling 1.0KiB
2019-04-24 13:56:14.279702: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 3 Chunks of size 1024 totalling 3.0KiB
2019-04-24 13:56:14.279720: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 1280 totalling 1.2KiB
2019-04-24 13:56:14.279738: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 6 Chunks of size 2048 totalling 12.0KiB
2019-04-24 13:56:14.279756: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 16384 totalling 16.0KiB
2019-04-24 13:56:14.279774: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 20736 totalling 20.2KiB
2019-04-24 13:56:14.279793: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 998912 totalling 975.5KiB
2019-04-24 13:56:14.279811: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 4194304 totalling 4.00MiB
2019-04-24 13:56:14.279828: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 8388608 totalling 8.00MiB
2019-04-24 13:56:14.279846: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 16777216 totalling 16.00MiB
2019-04-24 13:56:14.279864: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 28311552 totalling 27.00MiB
2019-04-24 13:56:14.279882: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 33554432 totalling 32.00MiB
2019-04-24 13:56:14.279900: I tensorflow/core/common_runtime/bfc_allocator.cc:641] 1 Chunks of size 38797312 totalling 37.00MiB
2019-04-24 13:56:14.279934: I tensorflow/core/common_runtime/bfc_allocator.cc:645] Sum Total of in-use chunks: 125.01MiB
2019-04-24 13:56:14.279960: I tensorflow/core/common_runtime/bfc_allocator.cc:647] Stats: Limit: 170393600 InUse: 131082240 MaxInUse: 131082240 NumAllocs: 41 MaxAllocSize: 38797312
2019-04-24 13:56:14.279989: W tensorflow/core/common_runtime/bfc_allocator.cc:271] ******************************************xxxxxxx**********************xxx***********x*
2019-04-24 13:56:14.280057: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at random_op.cc:202 : Resource exhausted: OOM when allocating tensor with shape[8192,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
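This error simply means that GPU memory is exhausted. One way around it is to reduce the number of iterations per run: I am currently looping 450 times in one go, and in my tests the run failed after at most 32 iterations, so splitting the loop into shorter runs would work. That approach is tedious, though. Is there a simpler way? Of course there is.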
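To see why the allocation in the log fails, the numbers can be checked by hand. The sketch below only redoes the arithmetic with values copied from the log; the one assumption is that "type float" means 4-byte float32.

```python
MIB = 1024 * 1024

# Values copied from the log above.
limit_bytes = 170393600    # "Limit" in the allocator stats
in_use_bytes = 131082240   # "InUse" in the allocator stats
shape = (8192, 4096)       # tensor shape in the OOM message
bytes_per_element = 4      # assumption: "type float" is float32

# Size of the tensor TensorFlow tried to allocate.
requested_bytes = shape[0] * shape[1] * bytes_per_element
# Upper bound on what is left; the free space may also be fragmented.
free_bytes = limit_bytes - in_use_bytes

print(f"requested: {requested_bytes / MIB:.2f} MiB")
print(f"limit:     {limit_bytes / MIB:.2f} MiB")
print(f"in use:    {in_use_bytes / MIB:.2f} MiB")
print(f"free:      {free_bytes / MIB:.2f} MiB")
print("fits" if requested_bytes <= free_bytes else "does not fit")
```

The request works out to 128 MiB, matching the "trying to allocate 128.00MiB" line, while at most about 37.5 MiB of the 162.5 MiB limit is still free.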
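First, disable GPU memory pre-allocation. By default, Keras (through its TensorFlow backend) claims nearly all GPU memory as soon as it starts. We would rather have it take only what it actually needs, which just requires adding the following at the top of the script: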
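import tensorflow as tf
import keras.backend.tensorflow_backend as K

# Allocate GPU memory on demand instead of reserving it all at startup.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
# Make Keras use the session configured above.
K.set_session(sess)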
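The snippet above targets TensorFlow 1.x, which is what this post was written against. For readers on TensorFlow 2.x, where tf.ConfigProto and tf.Session are no longer top-level names, a roughly equivalent setting might look like the sketch below. It was not part of the original experiment, and enable_gpu_memory_growth is a hypothetical helper name.

```python
def enable_gpu_memory_growth():
    """Turn on on-demand GPU memory allocation (TensorFlow 2.x sketch).

    Must be called before any GPU has been initialized, otherwise
    TensorFlow raises a RuntimeError.
    """
    import tensorflow as tf  # imported lazily; assumes TensorFlow 2.1 or later

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    # Number of GPUs the setting was applied to.
    return len(gpus)
```

Calling enable_gpu_memory_growth() once at the very top of the script plays the same role as allow_growth = True above.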
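This way only the memory that is actually needed gets allocated. In my tests, however, this step alone still ended in the same error, because memory keeps growing with every iteration of the for loop. So can we make Keras release memory each time it is done with a model? Yes: just add the following after each use of the model: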
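import tensorflow as tf
from keras import backend as K

# Destroy the current Keras graph and session to free the models built so far.
K.clear_session()
# Reset TensorFlow's default graph as well.
tf.reset_default_graph()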
That's it. No matter how many iterations the for loop runs, memory usage no longer grows.
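Put together, the loop could be organized roughly as in the sketch below. run_trials and keras_cleanup are hypothetical names used only for illustration; the part taken from this post is the K.clear_session() call after each use of the model.

```python
import gc


def run_trials(n_trials, run_one, cleanup):
    """Call run_one(i) for each trial and always run cleanup() afterwards."""
    results = []
    for i in range(n_trials):
        try:
            results.append(run_one(i))
        finally:
            # Runs even if run_one raises, so a failed trial is cleaned up too.
            cleanup()
            gc.collect()
    return results


def keras_cleanup():
    """Cleanup step for the Keras / TensorFlow 1.x setup used in this post."""
    from keras import backend as K  # imported lazily so the sketch loads without Keras

    K.clear_session()


if __name__ == "__main__":
    # Stand-in for "build a model, evaluate it, return a metric".
    print(run_trials(3, run_one=lambda i: i * i, cleanup=lambda: None))
```

In a real script, run_one would build and evaluate the model for trial i, and keras_cleanup would be passed as cleanup. Putting the cleanup in a finally block is a design choice so that an exception in one trial does not leave its graph behind.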
Every 2.0s: nvidia-smi                              Wed Apr 24 13:55:38 2019

Wed Apr 24 13:55:38 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.25                 Driver Version: 390.25                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 20%   52C    P2    55W / 250W |  10791MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
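Instead of watching nvidia-smi by eye, the same figures can be read from inside the loop to confirm that usage stays flat. The sketch below is one way to do that, assuming nvidia-smi is on the PATH; parse_memory_csv and query_gpu_memory are hypothetical helper names.

```python
import subprocess


def parse_memory_csv(text):
    """Parse lines like '10791, 11178' into (used_mib, total_mib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory per GPU, in MiB."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ]
    ).decode("utf-8")
    return parse_memory_csv(out)
```

Logging query_gpu_memory() once per iteration makes it easy to spot whether memory keeps climbing or levels off after the first few trials.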
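This article was last updated on May 21, 2019, more than 1 year ago. If any of its content or images no longer work, please leave a comment and we will fix it promptly. Thank you!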
I ran into this problem before as well. My test results differ from yours in two places:
1.
import tensorflow as tf
import keras.backend.tensorflow_backend as K
config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)
K.set_session(sess)
Dropping the final K.set_session(sess) still works.
2.
import tensorflow as tf
from keras import backend as K
K.clear_session()
tf.reset_default_graph()
Dropping tf.reset_default_graph() also works.
I am using Keras, not TensorFlow directly.