optimization speed question : array vs dictionary

:information_source: Attention Topic was automatically imported from the old Question2Answer platform.
:bust_in_silhouette: Asked By Inces

I am currently trying to fix huge lag spike in my AI calculation.
trhoroughout whole my code I use a “train” dictionary, which starts with 5 entries, and has up to 30 entries at resolution end. It is passed multiple times as argument and modified inside multiple functions. There are countless of dictionary lookups everywhere. The keys are just enums, values are various.

I need to know if it is worth it to change “train” data structure into an array. I could use the same enums which I used for dict keys, but that would mean “train” will have to always be initialized with all 30 entries at the start of any resolution. I will have to totally redo the code of large chunk of my project.

Can You tell me if difference in speed is going to be noticable ? Can it be twice faster or more ? Are there any other downsides ?

Measure, measure and measure again. It really depends on your code and the profiling you’ve done - is the dictionary lookup really an issue? It should be O(1) and if you know the size it is going to grow, you could size it just once as early as possible. Perhaps passing it around is the issue? Perhaps there should be an owner that does lookup in one place to minimise any accidental copies that are made.

There are many things to speculate on although I’d assume a dictionary isn’t one of them but again: profile and measure, identify the bottleneck, change one thing at a time and measure again.

spaceyjase | 2023-04-20 15:49

if you have 300000 entries there might be a difference between some lookup algorithms, difference between array and dict will probably be even less noticeable.

sand | 2023-04-20 16:36

dictionary should always be prefered over array if you are doing lookup: average O(1) computational complexity for dictionary and O(n) for array.

Indeed, accidental copies might be the source of your trouble as @spaceyJase suggested.

You may also want to look out for parts of your code where you are creating temporary arrays frequently to store things, where instead you could just create an array once, allocated all the entries, and just re-use the same array via array.clear()

Also in the worst case scenario, if your code can’t be optimized (which isn’t likely, but let’s consider that a possibility), you be forced to do your AI logic in a separate thread and add multi-threading to your project

godot_dev_ | 2023-04-20 16:51

Wow, thank guys. Now I see I can’t rely on anything GPTchat says, he is going all out hinting me on optimizations that don’t work.
I am basically working with profiler, but I was afraid of recoding half of my project just to compare with the old code and see insignificant difference.

That is interesting thing You mentioned godot_dev, with storing temporary arrays. How can I use profiler to observe if bottleneck is caused by memory usage as opposed to speed ?

Also : spaceyjace - do You mean searching array with find() is O(n) or accesing it by index ?

Inces | 2023-04-20 18:28

@Inces Not too sure how to use the profiler to detect speed vs. memory usage performance, but below is an example of using a temporary array in an inefficient manner (I beleive you will suffer speed performance due to allocating and freeing an array) to do a summation of all products of pairs of number in a large array

 var largeArray = ...
 var productSum = 0
 for i in largeArray.size():
     for j in largeArray.size():
        #purely for example. using a temprary array here is not necessary
        var tmpArray = [] #allocated every time the for loop repeats
        tmpArray.append(largeArray[i])
        tmpArray.append(largeArray[j])
        productSum = productSum + (tmpArray[0] * tmpArray[1])

Instead, you could use a single array and avoid having it be created every iteration of the for loop as follows

 var largeArray = ...
 var tmpArray = [] #only ever allocated once
 tmpArray.append(null)
 tmpArray.append(null)
 var productSum = 0
 for i in largeArray.size():
     for j in largeArray.size():
        #purely for example. using a temprary array here is not necessary
        tmpArray[0]=largeArray[i]
        tmpArray[1]=largeArray[j]
        productSum = productSum + (tmpArray[0] * tmpArray[1])

A more practical case would be your using an array to return 2 values from a function:

func funcWith2ReturnValues():
    return [123,456]

If the calling functions are only interested in reading the values returned and not populating the returned array with anything, then the below would be a more optimal solution (it would avoid creating an array every time funcWith2ReturnValues is called

var tmpArray = [null,null]
func funcWith2ReturnValues():
    tmpArray[0]=123
    tmpArray[1]=456
    return tmpArray

godot_dev_ | 2023-04-20 18:54

Thanks ! I collected all of your answers and optimizations are going much smoother now. Also I identified all the bottlenecks and know what do I have to change in general design :slight_smile:

Inces | 2023-04-23 16:30