TensorFlow can't allocate memory despite it being available

I'm running a CNN on a rig with a GTX 970 with 4 GB of VRAM. However, when my code reaches tf.initialize_all_variables(), it says that it can't allocate enough memory. Here is the exact line:

W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 625.0KiB. See logs for memory state. 
W tensorflow/core/framework/op_kernel.cc:899] Internal: Dst tensor is not initialized. 
E tensorflow/core/common_runtime/executor.cc:334] Executor failed to create kernel. Internal: Dst tensor is not initialized. 
[[Node: zeros_30 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [160000] values: 0 0 0...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]] 

As you can see, it says it can't allocate 625.0 KiB, which the 970's 4 GB should be able to handle.
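For reference, the 625.0 KiB figure follows directly from the failing node in the log: zeros_30 is a DT_FLOAT tensor of shape [160000], and DT_FLOAT is a 32-bit float. A minimal sketch of that arithmetic (the variable names are only illustrative):

```python
# Size of the tensor that failed to allocate, taken from the log:
# zeros_30 has dtype DT_FLOAT (32-bit float) and shape [160000].
BYTES_PER_FLOAT32 = 4
num_elements = 160000

requested_bytes = num_elements * BYTES_PER_FLOAT32
requested_kib = requested_bytes / 1024

print(f"{requested_bytes} bytes = {requested_kib} KiB")
```

So the request itself is tiny; the question is what already occupies the rest of the memory.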

Here is the full log, in case it is of any help:

I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX 970 
major: 5 minor: 2 memoryClockRate (GHz) 1.3165 
pciBusID 0000:01:00.0 
Total memory: 3.94GiB 
Free memory: 3.52GiB 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0) 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (256): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (512): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1024): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (2048): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (4096): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (8192): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (16384):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (32768):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (65536):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (131072): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (262144): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (524288): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1048576): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (2097152): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (4194304): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (8388608): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (16777216): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (33554432): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (67108864): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (134217728):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (268435456):  Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin. 
I tensorflow/core/common_runtime/bfc_allocator.cc:656] Bin for 625.0KiB was 512.0KiB, Chunk State: 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40000 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40100 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40200 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40300 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40400 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e40500 of size 8192 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e42500 of size 16384 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705e46500 of size 640000 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2900 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2a00 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2b00 of size 1024 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee2f00 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705ee3000 of size 51200 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705eef800 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705eef900 of size 73728 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f01900 of size 256 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f01a00 of size 73728 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f13a00 of size 18432 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x705f18200 of size 15884288 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x706e3e200 of size 8192 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x706e40200 of size 33554432 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x708e40200 of size 16384 
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0x708e44200 of size 3410411008 
I tensorflow/core/common_runtime/bfc_allocator.cc:689]  Summary of in-use Chunks by size: 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 10 Chunks of size 256 totalling 2.5KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 1024 totalling 1.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 8192 totalling 16.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 16384 totalling 32.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 18432 totalling 18.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 51200 totalling 50.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 2 Chunks of size 73728 totalling 144.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 640000 totalling 625.0KiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 15884288 totalling 15.15MiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 33554432 totalling 32.00MiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 3410411008 totalling 3.18GiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:696] Sum Total of in-use chunks: 3.22GiB 
I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats: 
Limit:     3460759552 
InUse:     3460759552 
MaxInUse:    3460759552 
NumAllocs:      23 
MaxAllocSize:   3410411008 

W tensorflow/core/common_runtime/bfc_allocator.cc:270] ******************************************************************************xxxxxxxxxxxxxxxxxxxxxx 
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 625.0KiB. See logs for memory state. 
W tensorflow/core/framework/op_kernel.cc:899] Internal: Dst tensor is not initialized. 
E tensorflow/core/common_runtime/executor.cc:334] Executor failed to create kernel. Internal: Dst tensor is not initialized. 
    [[Node: zeros_30 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [160000] values: 0 0 0...>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]] 

I've been tearing my hair out trying to fix this problem, and I've already reduced the size of my CNN by far more than I'm comfortable with.

Also, I'm running my displays off my 970 ... do I need to plug them into the motherboard instead to get the full benefit of my 970?

Thanks!

Answer

From the logs:

Limit:     3460759552 
InUse:     3460759552 
MaxInUse:    3460759552 
NumAllocs:      23 
MaxAllocSize:   3410411008 

You are already maxing out the memory on your GPU (InUse equals Limit); the model is too large to be processed on this device.
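The same conclusion can be checked against the "Summary of in-use Chunks by size" section of the log. The sketch below is plain Python with the counts and sizes copied from that summary by hand; it is not a TensorFlow API, and the names are only illustrative.

```python
# (count, size in bytes) pairs copied from the allocator's
# "Summary of in-use Chunks by size" lines in the log.
IN_USE_CHUNKS = [
    (10, 256),
    (1, 1024),
    (2, 8192),
    (2, 16384),
    (1, 18432),
    (1, 51200),
    (2, 73728),
    (1, 640000),
    (1, 15884288),
    (1, 33554432),
    (1, 3410411008),
]
LIMIT_BYTES = 3460759552  # "Limit" from the Stats block

total_chunks = sum(count for count, _ in IN_USE_CHUNKS)
total_bytes = sum(count * size for count, size in IN_USE_CHUNKS)
largest_chunk = max(size for _, size in IN_USE_CHUNKS)
largest_share = largest_chunk / LIMIT_BYTES

print(f"chunks in use:  {total_chunks}")
print(f"bytes in use:   {total_bytes} (limit {LIMIT_BYTES})")
print(f"largest chunk:  {largest_chunk} bytes, {largest_share:.1%} of the limit")
```

The chunk sizes add up exactly to the limit, and a single 3410411008-byte chunk accounts for almost all of it. A chunk that size could hold up to roughly 852 million float32 values, so the thing to look for is presumably one very large tensor in the model rather than many small ones.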