Checking whether a dictionary contains a question grid

Question

I have a pickle file that represents questions and answers in dictionary format. I would like to find the entries that are question grids, like the one below, so that I can put them in a CSV as gcoronel99 did here:

The structure of this example is as follows:

{'question': 'To what extent are the following factors considerations in your choice of flight? ',
   'answers': [None,
    7,
    [[514996986,
      [['Very important consideration'],
       ['Important consideration'],
       ['Neutral'],
       ['Not an important consideration'],
       ['Do not consider']],
      0,
      ['The airline/company you fly with'],
      None,
      None,
      None,
      None,
      None,
      None,
      None,
      [1]],
     [396483948,
      [['Very important consideration'],
       ['Important consideration'],
       ['Neutral'],
       ['Not an important consideration'],
       ['Do not consider']],
      0,
      ['The departure airport'],
      None,
      None,
      None,
      None,
      None,
      None,
      None,
      [1]],
     [971070641,
      [['Very important consideration'],
       ['Important consideration'],
       ['Neutral'],
       ['Not an important consideration'],
       ['Do not consider']],
      0,
      ['Duration of flight/route'],
      None,
      None,
      None,
      None,
      None,
      None,
      None,
      [1]],
     [1685960105,
      [['Very important consideration'],
       ['Important consideration'],
       ['Neutral'],
       ['Not an important consideration'],
       ['Do not consider']],
      0,
      ['Price'],
      None,
      None,
      None,
      None,
      None,
      None,
      None,
      [1]],
     [231217486,
      [['Very important consideration'],
       ['Important consideration'],
       ['Neutral'],
       ['Not an important consideration'],
       ['Do not consider']],
      0,
      ['Baggage policy'],
      None,
      None,
      None,
      None,
      None,
      None,
      None,
      [1]],
     [940990935,
      [['Very important consideration'],
       ['Important consideration'],
       ['Neutral'],
       ['Not an important consideration'],
       ['Do not consider']],
      0,
      ['Environmental impacts'],
      None,
      None,
      None,
      None,
      None,
      None,
      None,
      [1]]]]}

The other grids (if there are any) may differ in their number of rows or columns. In the general case, I think a grid can be recognized when the answers contain several lists of lists of lists.

So I tried to check whether the idea works at least with the example above:

import pandas as pd

dic = pd.read_pickle(r'Python/interns.p')

def isGrid(qa):
    d_answers = qa['answers']
#     print("qa['question']: ", qa['question'])
    if qa['question'] == 'To what extent are the following factors considerations in your choice of flight?':
        try:
            print("qa['question']: ", qa['question'])
            answers = d_answers[2]
            if len(answers)>1:
                print(True)
                return True
        except TypeError:
            print("truc")
    
for qa_set in dic:
    for qa in qa_set:
        bool_grid = isGrid(qa)

But I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1062-6d260b7f25bc> in <module>
     18 for qa_set in dic:
     19     for qa in qa_set:
---> 20         bool_grid = isGrid(qa)
     21 
     22 # isGrid(qa)

<ipython-input-1062-6d260b7f25bc> in isGrid(qa)
      4 
      5 def isGrid(qa):
----> 6     d_answers = qa['answers']
      7 #     print("qa['question']: ", qa['question'])
      8     if qa['question'] == 'To what extent are the following factors considerations in your choice of flight?':

TypeError: string indices must be integers

dictionaries python python 3.x

user166844 · Answer

Why does the error occur?
First, let's review what iterating over a dictionary involves:
for key in dic:
    print(key)

One might expect this loop to print each key-value pair of the dictionary, something like this:
question: To what extent are the following factors considerations in your choice of flight?
answers: [None, 7, [[514996986, [['Very important consideration'], ['Important consideration'], ['Neutral'], ['Not an important consideration'], ['Do not consider']], 0, ['The airline/company you fly with'], None, None, None, None, None, None, None, [1]], [396483948, [['Very important consideration'], ['Important consideration'], ['Neutral'], ['Not an important consideration'], ['Do not consider']], 0, ['The departure airport'], None, None, None, None, None, None, None, [1]], [971070641, [['Very important consideration'], ['Important consideration'], ['Neutral'], ['Not an important consideration'], ['Do not consider']], 0, ['Duration of flight/route'], None, None, None, None, None, None, None, [1]], [1685960105, [['Very important consideration'], ['Important consideration'], ['Neutral'], ['Not an important consideration'], ['Do not consider']], 0, ['Price'], None, None, None, None, None, None, None, [1]], [231217486, [['Very important consideration'], ['Important consideration'], ['Neutral'], ['Not an important consideration'], ['Do not consider']], 0, ['Baggage policy'], None, None, None, None, None, None, None, [1]], [940990935, [['Very important consideration'], ['Important consideration'], ['Neutral'], ['Not an important consideration'], ['Do not consider']], 0, ['Environmental impacts'], None, None, None, None, None, None, None, [1]]]]

What actually happens?
It prints only the keys, which are plain strings:
question
answers
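
As a minimal sketch of the difference, the snippet below uses a shortened stand-in dictionary (the name `sample` and its contents are assumptions for illustration, not the data from the pickle). Iterating directly yields the keys, while `.items()` yields the key-value pairs:

```python
# Shortened stand-in for the question's dictionary (illustrative data only).
sample = {"question": "Example question? ", "answers": [None, 7, []]}

# Iterating over a dict directly yields its keys.
keys = [key for key in sample]

# .items() yields (key, value) tuples instead.
pairs = [(key, value) for key, value in sample.items()]

print(keys)
print(pairs)
```

If the goal is to look at the values while looping, `.items()` or `.values()` is probably what is wanted.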

So the nested loop
for qa_set in dic:
    for qa in qa_set:
        bool_grid = isGrid(qa)

is calling the function with each letter of both words. You can check this by printing the value of qa:
for qa_set in dic:
    for qa in qa_set:
        print(qa)

prints
q
u
e
s
t
i
o
n
a
n
s
w
e
r
s

So the error on the line d_answers = qa['answers'] now makes sense. Each individual letter is a string, and the code tries to access a key on it as if it were a dictionary. In other words, what happens is this:
d_answers = "q"['answers']

That is why you get the error
TypeError: string indices must be integers

Strings do not have keys, only indices. The first location shown in your traceback (the isGrid(qa) call) fails because of the second one (the lookup inside the function).
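
The failure can be reproduced in isolation. This is a small sketch with a stand-in dictionary (`sample` is an assumed name with made-up contents); the exact wording of the exception message varies between Python versions, so only the exception type is relied on here:

```python
# Stand-in dictionary with the same two keys as the question's data.
sample = {"question": "Example question? ", "answers": [None, 7, []]}

# The nested loop visits every character of every key.
chars = [ch for key in sample for ch in key]

# Indexing a one-character string with a string key raises TypeError.
try:
    chars[0]["answers"]
    raised = False
except TypeError as exc:
    raised = True
    print(type(exc).__name__, exc)
```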
Does the isGrid() function actually work?
The answer is no. In the conditional
if qa['question'] == 'To what extent are the following factors considerations in your choice of flight?':

you are checking whether the value stored under that key equals that string. If you look at the dictionary dic, the value of the key "question" is the string you compare against, but with a trailing space. To avoid modifying the data, the fix would be:
if qa['question'].strip() == 'To what extent are the following factors considerations in your choice of flight?':

This works because the .strip() method removes whitespace from the beginning and end of a string.
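
A quick check of that comparison, using the question text from the example (the variable names `stored` and `target` are only illustrative):

```python
# Question text as stored in the dictionary: note the trailing space.
stored = "To what extent are the following factors considerations in your choice of flight? "
# String used in the comparison: no trailing space.
target = "To what extent are the following factors considerations in your choice of flight?"

plain_match = stored == target             # fails because of the trailing space
stripped_match = stored.strip() == target  # succeeds once whitespace is removed

print(plain_match, stripped_match)
```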
Is it necessary to call the function on each key-value pair of dic?
In my view, no, because the function works with both key-value pairs of the dictionary. It is enough to pass dic itself as the function's argument.
The function:
def isGrid(qa):
    d_answers = qa['answers']
    if qa['question'].strip() == 'To what extent are the following factors considerations in your choice of flight?':
        try:
            print("qa['question']: ", qa['question'])
            answers = d_answers[2]
            if len(answers) > 1:
                print(True)
                return True
        except TypeError:
            print("truc")

Using dic as the argument:
isGrid(dic)

would print
qa['question']:  To what extent are the following factors considerations in your choice of flight?
True

You surely have your reasons for iterating that way; I don't know what is in your pickle. Still, at least you now know why the error occurs.
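
If the pickle turns out to hold several question dictionaries, a structure-based check along the lines you describe could look like the sketch below. Everything here is an assumption drawn from the single example shown: `looks_like_grid` and `find_grids` are hypothetical helpers, the rows are assumed to live at `answers[2]`, and each row is assumed to carry its list of options at index 1. The sample data is made up and much shorter than the real entry.

```python
def looks_like_grid(qa):
    """Heuristic: answers[2] has more than one row and every row has an option list at index 1."""
    if not isinstance(qa, dict):
        return False  # e.g. a key string reached by iterating over a dict
    answers = qa.get("answers")
    if not isinstance(answers, list) or len(answers) < 3:
        return False
    rows = answers[2]
    if not isinstance(rows, list) or len(rows) < 2:
        return False
    return all(
        isinstance(row, list) and len(row) > 1 and isinstance(row[1], list)
        for row in rows
    )


def find_grids(data):
    """Accept either a single question dict or an iterable of them."""
    items = [data] if isinstance(data, dict) else list(data)
    return [qa for qa in items if looks_like_grid(qa)]


# Made-up stand-in data following the layout of the example in the question.
grid = {
    "question": "Grid question ",
    "answers": [None, 7, [
        [1, [["Yes"], ["No"]], 0, ["Row 1"]],
        [2, [["Yes"], ["No"]], 0, ["Row 2"]],
    ]],
}
single = {"question": "Single question", "answers": [None, 2, [[3, [["A"]], 0]]]}

print(len(find_grids([grid, single])))
```

Checking the structure instead of the question text avoids hard-coding one specific question, but the thresholds would need adjusting once you know how the other entries in the pickle are laid out.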
