Flickering after GPU is paused/completed run

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

ajgringo619
Posts: 23
Joined: Fri Feb 14, 2020 5:14 am

Flickering after GPU is paused/completed run

Post by ajgringo619 »

[System specs are in my sig]

The GTX 1070 is my primary GPU, and it only folds when I'm not using the machine. Whenever I stop folding on it, either by pausing or by letting the WU finish, the screen begins to flicker - almost like a pulse that jars the whole screen (actually both screens) - but not all the time and not with any noticeable pattern. Since it's my main GPU, I cannot reset it without a reboot.

Does anyone know of a workaround? Rebooting fixes everything and is not that big of a deal, but I'd rather avoid it. Everything else is working great.
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: Flickering after GPU is paused/completed run

Post by gunnarre »

This sounds like the GPU might be boosting too high when it is cooling down from high load and becoming unstable. Otherwise there might be issues with drivers, or perhaps something wrong with the card itself. Sometimes it might even be system RAM, motherboard or the PSU which is to blame, but let's take one step at a time.

The first thing I'd try is to remove any overclocking, including factory-applied overclocking. If it's not overclocked, try under-clocking it slightly or perhaps giving it a bit more voltage (within reason).

Can you show the output of

Code:

nvidia-smi -q
in code tags? You might want to go through and remove any data which shows your username or running apps - we don't need anything below "Processes" in the output.
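
If trimming the output by hand is tedious, a small filter can drop the per-GPU "Processes" blocks before you paste. This is only a sketch: the function name strip_processes is made up for illustration, and it assumes the layout shown by `nvidia-smi -q` on recent drivers, where each GPU section starts with a line beginning "GPU " and the process list is the last block in that section.

```shell
# Hypothetical helper: remove every "Processes" block from `nvidia-smi -q` output.
# Reads the report on stdin and writes the trimmed report to stdout.
strip_processes() {
    awk '
        /^GPU /        { skip = 0 }   # a new GPU section starts: print again
        /^ *$/         { skip = 0 }   # blank separator line: print again
        /^ +Processes/ { skip = 1 }   # indented "Processes" header: stop printing
        !skip                         # print the line unless we are skipping
    '
}
```

Typical use would be `nvidia-smi -q | strip_processes`. It does not touch other fields, so check the result yourself before posting.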
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950
ajgringo619
Posts: 23
Joined: Fri Feb 14, 2020 5:14 am

Re: Flickering after GPU is paused/completed run

Post by ajgringo619 »

Requested info is below. The card has been set with factory defaults since I bought it (about 2 years ago). One thing that I'm trying now - and it seems to be helping - is to enable persistence.

Code:

$ nvidia-smi -q

==============NVSMI LOG==============

Timestamp                                 : Tue Jan 11 09:40:30 2022
Driver Version                            : 470.94
CUDA Version                              : 11.4

Attached GPUs                             : 2
GPU 00000000:08:00.0
    Product Name                          : NVIDIA GeForce GTX 1070
    Product Brand                         : GeForce
    Display Mode                          : Enabled
    Display Active                        : Enabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : 0324516017546
    GPU UUID                              : GPU-1f3d5e07-70af-abf8-84d3-4a0b2c6cbe6f
    Minor Number                          : 0
    VBIOS Version                         : 86.04.26.00.01
    MultiGPU Board                        : No
    Board ID                              : 0x800
    GPU Part Number                       : 900-1G411-0020-000
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.01.03
        OEM Object                        : 1.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x08
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x1B8110DE
        Bus Id                            : 00000000:08:00.0
        Sub System Id                     : 0x119D10DE
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 3
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 15000 KB/s
    Fan Speed                             : 70 %
    Performance State                     : P2
    Clocks Throttle Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 8111 MiB
        Used                              : 584 MiB
        Free                              : 7527 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 6 MiB
        Free                              : 250 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 1 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            Single Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
            Double Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
        Aggregate
            Single Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
            Double Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 61 C
        GPU Shutdown Temp                 : 99 C
        GPU Slowdown Temp                 : 96 C
        GPU Max Operating Temp            : N/A
        GPU Target Temperature            : 83 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 30.24 W
        Power Limit                       : 90.00 W
        Default Power Limit               : 151.00 W
        Enforced Power Limit              : 90.00 W
        Min Power Limit                   : 75.00 W
        Max Power Limit                   : 170.00 W
    Clocks
        Graphics                          : 911 MHz
        SM                                : 911 MHz
        Memory                            : 3802 MHz
        Video                             : 810 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 1911 MHz
        SM                                : 1911 MHz
        Memory                            : 4004 MHz
        Video                             : 1708 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 993
            Type                          : G
            Name                          : /usr/lib/Xorg
            Used GPU Memory               : 410 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 1752
            Type                          : G
            Name                          : cinnamon
            Used GPU Memory               : 60 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 30896
            Type                          : G
            Name                          : /usr/lib/brave-bin/brave --type=gpu-process --field-trial-handle=4668161321023072810,10587110712259720672,131072 --enable-crashpad --crashpad-handler-pid=30862 --enable-crash-reporter=da511919-86cd-4b74-b7d9-e87e613ccefd, --change-stack-guard-on-fork=enable --gpu-preferences=UAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAAAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAAABgAAAAAAAAAGAAAAAAAAAAIAAAAAAAAAAgAAAAAAAAACAAAAAAAAAA= --shared-files
            Used GPU Memory               : 101 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 139959
            Type                          : G
            Name                          : scrcpy
            Used GPU Memory               : 6 MiB

GPU 00000000:09:00.0
    Product Name                          : NVIDIA GeForce GTX 1050 Ti
    Product Brand                         : GeForce
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-6d0fe567-6144-eac8-50a3-d07e073ad901
    Minor Number                          : 1
    VBIOS Version                         : 86.07.39.00.52
    MultiGPU Board                        : No
    Board ID                              : 0x900
    GPU Part Number                       : N/A
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.01.04
        OEM Object                        : 1.1
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x09
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x1C8210DE
        Bus Id                            : 00000000:09:00.0
        Sub System Id                     : 0x62553842
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 3
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 345000 KB/s
        Rx Throughput                     : 3226000 KB/s
    Fan Speed                             : 75 %
    Performance State                     : P0
    Clocks Throttle Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 4040 MiB
        Used                              : 349 MiB
        Free                              : 3691 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 4 MiB
        Free                              : 252 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 100 %
        Memory                            : 78 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            Single Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
            Double Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
        Aggregate
            Single Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
            Double Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 73 C
        GPU Shutdown Temp                 : 102 C
        GPU Slowdown Temp                 : 99 C
        GPU Max Operating Temp            : N/A
        GPU Target Temperature            : 83 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : N/A
        Power Limit                       : 75.00 W
        Default Power Limit               : 75.00 W
        Enforced Power Limit              : 75.00 W
        Min Power Limit                   : 52.50 W
        Max Power Limit                   : 75.00 W
    Clocks
        Graphics                          : 1695 MHz
        SM                                : 1695 MHz
        Memory                            : 3504 MHz
        Video                             : 1518 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 1987 MHz
        SM                                : 1987 MHz
        Memory                            : 3504 MHz
        Video                             : 1708 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 993
            Type                          : G
            Name                          : /usr/lib/Xorg
            Used GPU Memory               : 4 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 116233
            Type                          : C
            Name                          : /var/lib/private/fah/cores/cores.foldingathome.org/lin/64bit/22-0.0.18/Core_22.fah/FahCore_22
            Used GPU Memory               : 341 MiB
ajgringo619
Posts: 23
Joined: Fri Feb 14, 2020 5:14 am

Re: Flickering after GPU is paused/completed run

Post by ajgringo619 »

For now, this issue seems to be fixed with the following settings:
1) Enable nvidia-persistenced.service
2) Reset the power limit to its default (151 W)
3) When possible, let a WU finish instead of pausing it mid-run (this one seems to be the most important)
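
For anyone wanting to try the first two settings, they might look roughly like this on a systemd-based distro. Treat it as a sketch under assumptions: the function names run and restore_gpu_defaults are invented for illustration, the GPU index (0) and wattage (151) come from the output posted earlier in this thread, and your distro may package the persistence daemon under a different unit name. `nvidia-smi -i` selects the GPU and `-pl` sets its power limit in watts; both need root.

```shell
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Hypothetical wrapper for settings 1) and 2) from the post above.
#   $1 = GPU index as listed by `nvidia-smi -L` (0 for the GTX 1070 here)
#   $2 = power limit in watts ("Default Power Limit" in `nvidia-smi -q`)
restore_gpu_defaults() {
    gpu_index="$1"
    watts="$2"
    # 1) start the persistence daemon now and on every boot
    run sudo systemctl enable --now nvidia-persistenced.service
    # 2) put the power limit back to the card's default
    run sudo nvidia-smi -i "$gpu_index" -pl "$watts"
}
```

Running `DRY_RUN=1 restore_gpu_defaults 0 151` only prints the two commands so you can check them first; without DRY_RUN they are executed. Setting 3) is a habit rather than a command, so it is not covered here.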