python - Fastest way to create a dict from a large list of dicts


I need to create a dictionary from a large list of dicts, removing all duplicate dicts.

The input list is something like:

  input = [
      {'id': 1, 'value1': 'value1', 'value2': 'value2'},
      {'id': 2, 'value1': 'value1', 'value2': 'value2'},
      {'id': 2, 'value1': 'value1', 'value3': 'value4'},
  ]

And I want to make a dictionary like this, using the "id" value as the key for a new dictionary:

  output = {
      1: [{'id': 1, 'value1': 'value1', 'value2': 'value2'}],
      2: [{'id': 2, 'value1': 'value1', 'value2': 'value2'},
          {'id': 2, 'value1': 'value1', 'value3': 'value4'}],
  }

My first try was:

  output = {}
  for el in input:
      # "el not in output[...]" scans the whole list for that id each time
      if el['id'] not in output or el not in output[el['id']]:
          output.setdefault(el['id'], []).append(el)

And it actually works, but it is very slow; len(input) is about 20k to 30k items.

Is there any way to make it a bit quicker?

Thank you!

Use a separate set to track the dictionaries you have already seen; you have to convert them to a hashable representation:

  seen = set()
  # tuple of sorted (key, value) pairs: hashable and independent of key order
  drepr = lambda d: tuple(sorted(d.items()))
  output = {}
  for el in input:
      if drepr(el) not in seen:
          output.setdefault(el['id'], []).append(el)
          seen.add(drepr(el))
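
To illustrate why the conversion is needed, here is a minimal sketch. It assumes the dicts are flat, that all values are hashable, and that the keys can be sorted against each other (all strings here); the sample dicts `a` and `b` are made up for the illustration.

```python
def drepr(d):
    """Hashable, order-independent snapshot of a flat dict."""
    return tuple(sorted(d.items()))

a = {'id': 2, 'value1': 'value1'}
b = {'value1': 'value1', 'id': 2}  # same content, different insertion order

error = None
try:
    {a}  # dicts are mutable, so they cannot be put in a set directly
except TypeError as exc:
    error = str(exc)

same_key = drepr(a) == drepr(b)   # sorting removes the ordering difference
seen = {drepr(a)}
already_seen = drepr(b) in seen   # average O(1) lookup instead of a list scan
```

If the values themselves were lists or nested dicts, this key would not be hashable and a different representation would be needed.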

You can speed this up a little by using a defaultdict, because that removes the need for a method lookup and a stack frame push to call the method:

  from collections import defaultdict

  seen = set()
  drepr = lambda d: tuple(sorted(d.items()))
  output = defaultdict(list)  # missing ids start out as an empty list
  for el in input:
      if drepr(el) not in seen:
          output[el['id']].append(el)
          seen.add(drepr(el))
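
As a small side-by-side sketch of the two grouping styles (the keys and values here are placeholders, not data from the question): both build the same mapping, but a defaultdict also inserts a key when you merely read a missing one, so it can be worth converting back to a plain dict once grouping is done.

```python
from collections import defaultdict

# Plain dict: setdefault is a method call on every iteration.
plain = {}
plain.setdefault(1, []).append('a')

# defaultdict: the empty list is created by the subscript itself.
grouped = defaultdict(list)
grouped[1].append('a')

equal = plain == grouped          # compares like an ordinary dict

_ = grouped[99]                   # reading a missing key inserts an empty list
has_99 = 99 in grouped
missing = grouped.get(100)        # .get() does not insert anything

as_dict = dict(grouped)           # freeze into a plain dict when finished
```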

Demo:

  >>> input = [{'id': 1, 'value1': 'value1', 'value2': 'value2'}, {'id': 2, 'value1': 'value1', 'value2': 'value2'}, {'id': 2, 'value1': 'value1', 'value3': 'value4'}]
  >>> seen = set()
  >>> drepr = lambda d: tuple(sorted(d.items()))
  >>> output = {}
  >>> for el in input:
  ...     if drepr(el) not in seen:
  ...         output.setdefault(el['id'], []).append(el)
  ...         seen.add(drepr(el))
  ...
  >>> from pprint import pprint
  >>> pprint(output)
  {1: [{'id': 1, 'value1': 'value1', 'value2': 'value2'}],
   2: [{'id': 2, 'value1': 'value1', 'value2': 'value2'},
       {'id': 2, 'value1': 'value1', 'value3': 'value4'}]}
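
If you want to check the difference on your own data, a rough timing sketch could look like the following. The function names and the synthetic `make_input` data are hypothetical, and the numbers will depend on how many dicts share each id, so treat it as a starting point rather than a benchmark.

```python
import timeit
from collections import defaultdict

def make_input(n, ids=50):
    # Hypothetical data: n flat dicts over `ids` ids, each unique dict appearing twice.
    return [{'id': i % ids, 'value1': 'v%d' % (i % (n // 2))} for i in range(n)]

def group_with_list_scan(items):
    # The approach from the question: membership test against a list.
    output = {}
    for el in items:
        if el['id'] not in output or el not in output[el['id']]:
            output.setdefault(el['id'], []).append(el)
    return output

def group_with_seen_set(items):
    # The approach from the answer: membership test against a set of tuples.
    seen = set()
    output = defaultdict(list)
    for el in items:
        key = tuple(sorted(el.items()))
        if key not in seen:
            output[el['id']].append(el)
            seen.add(key)
    return output

data = make_input(2000)
same_result = group_with_list_scan(data) == dict(group_with_seen_set(data))

t_list = timeit.timeit(lambda: group_with_list_scan(data), number=3)
t_set = timeit.timeit(lambda: group_with_seen_set(data), number=3)
print('list scan: %.4fs, seen set: %.4fs' % (t_list, t_set))
```

The list scan gets slower as more dicts pile up under one id, so the gap should widen when the input has few distinct ids.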
