I am trying to get code working from the following repo, which is based on this paper. It had a lot of errors, but I mostly got it working. However, I keep hitting the same problem, and I really do not understand how to troubleshoot it or what is even going wrong.
The error occurs the second time the validation if-statement criteria are met: the first time it always works, then it breaks on the second. I'm including the output it prints before breaking, in case it's helpful. See the error below:
Here is the code (which is slightly different from the repo in order to get it to run):
Versions: Python 3
tensorflow == 1.15.0
pandas == 0.25.3
numpy == 1.17.5
import glob
import pandas as pd
import tensorflow as tf
import numpy as np
# preprocess data
file_list = []
for f in glob.glob('swda/*'):
    file_list.append(f)

df_list = []
for i in file_list:
    df = pd.read_csv(i)
    df_list.append(df)

text_list = []
label_list = []
for df in df_list:
    df['utterance_no_specialchar_'] = df.utterance_no_specialchar.astype(str)
    text = df.utterance_no_specialchar_.tolist()
    labels = df.da_category.tolist()
    text_list.append(text)
    label_list.append(labels)
### new preprocessing step
text_list = [[[j] for j in i] for i in text_list]
tok_data = [y[0] for x in text_list for y in x]
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(tok_data)
sequences = []
for x in text_list:
    tmp = []
    for y in x:
        tmp.append(tokenizer.texts_to_sequences(y)[0])
    sequences.append(tmp)
def _pad_sequences(sequences, pad_tok, max_length):
    """
    Args:
        sequences: a generator of list or tuple
        pad_tok: the char to pad with
        max_length: the length to pad or truncate every sequence to
    Returns:
        a list of lists where each sublist has the same length
    """
    sequence_padded, sequence_length = [], []
    for seq in sequences:
        seq = list(seq)
        seq_ = seq[:max_length] + [pad_tok] * max(max_length - len(seq), 0)
        sequence_padded += [seq_]
        sequence_length += [min(len(seq), max_length)]
    return sequence_padded, sequence_length
def pad_sequences(sequences, pad_tok, nlevels=1):
    """
    Args:
        sequences: a generator of list or tuple
        pad_tok: the char to pad with
        nlevels: "depth" of padding, for the case where we have character ids
    Returns:
        a list of lists where each sublist has the same length
    """
    if nlevels == 1:
        max_length = max(map(lambda x: len(x), sequences))
        sequence_padded, sequence_length = _pad_sequences(sequences,
                                                          pad_tok, max_length)
    elif nlevels == 2:
        max_length_word = max([max(map(lambda x: len(x), seq))
                               for seq in sequences])
        sequence_padded, sequence_length = [], []
        for seq in sequences:
            # all words are the same length now
            sp, sl = _pad_sequences(seq, pad_tok, max_length_word)
            sequence_padded += [sp]
            sequence_length += [sl]
        max_length_sentence = max(map(lambda x: len(x), sequences))
        sequence_padded, _ = _pad_sequences(sequence_padded,
                                            [pad_tok] * max_length_word,
                                            max_length_sentence)
        sequence_length, _ = _pad_sequences(sequence_length, 0,
                                            max_length_sentence)
    return sequence_padded, sequence_length
def minibatches(data, labels, batch_size):
    data_size = len(data)
    num_batches_per_epoch = int((len(data) + batch_size - 1) / batch_size)
    for batch_num in range(num_batches_per_epoch):
        start_index = batch_num * batch_size
        end_index = min((batch_num + 1) * batch_size, data_size)
        yield data[start_index:end_index], labels[start_index:end_index]
def select(parameters, length):
    """Select the last valid time step output as the sentence embedding
    :params parameters: [batch, seq_len, hidden_dims]
    :params length: [batch]
    :Returns : [batch, hidden_dims]
    """
    shape = tf.shape(parameters)
    idx = tf.range(shape[0])
    idx = tf.stack([idx, length - 1], axis=1)
    return tf.gather_nd(parameters, idx)
class DAModel():
    def __init__(self):
        with tf.variable_scope("placeholder"):
            self.dialogue_lengths = tf.placeholder(tf.int32, shape=[None], name="dialogue_lengths")
            self.word_ids = tf.placeholder(tf.int32, shape=[None, None, None], name="word_ids")
            self.utterance_lengths = tf.placeholder(tf.int32, shape=[None, None], name="utterance_lengths")
            self.labels = tf.placeholder(tf.int32, shape=[None, None], name="labels")
            self.clip = tf.placeholder(tf.float32, shape=[], name='clip')

        ######################## EMBEDDINGS ###########################################
        with tf.variable_scope("embeddings"):
            _word_embeddings = tf.get_variable(
                name="_word_embeddings",
                dtype=tf.float32,
                shape=[words, word_dim],
                initializer=tf.random_uniform_initializer()
            )
            word_embeddings = tf.nn.embedding_lookup(_word_embeddings, self.word_ids, name="word_embeddings")
            self.word_embeddings = tf.nn.dropout(word_embeddings, 0.8)

        with tf.variable_scope("utterance_encoder"):
            s = tf.shape(self.word_embeddings)
            batch_size = s[0] * s[1]
            time_step = s[-2]
            word_embeddings = tf.reshape(self.word_embeddings, [batch_size, time_step, word_dim])
            length = tf.reshape(self.utterance_lengths, [batch_size])
            fw = tf.nn.rnn_cell.LSTMCell(hidden_size_lstm_1, forget_bias=0.8, state_is_tuple=True)
            bw = tf.nn.rnn_cell.LSTMCell(hidden_size_lstm_1, forget_bias=0.8, state_is_tuple=True)
            output, _ = tf.nn.bidirectional_dynamic_rnn(fw, bw, word_embeddings, sequence_length=length, dtype=tf.float32)
            output = tf.concat(output, axis=-1)  # [batch_size, time_step, dim]
            # Select the last valid time step output as the utterance embedding,
            # this method is more concise than TensorArray with while_loop
            # output = select(output, self.utterance_lengths) # [batch_size, dim]
            output = select(output, length)  # [batch_size, dim]
            # output = tf.reshape(output, s[0], s[1], 2 * hidden_size_lstm_1)
            output = tf.reshape(output, [s[0], s[1], 2 * hidden_size_lstm_1])
            output = tf.nn.dropout(output, 0.8)

        with tf.variable_scope("bi-lstm"):
            cell_fw = tf.contrib.rnn.BasicLSTMCell(hidden_size_lstm_2, state_is_tuple=True)
            cell_bw = tf.contrib.rnn.BasicLSTMCell(hidden_size_lstm_2, state_is_tuple=True)
            (output_fw, output_bw), _ = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, output, sequence_length=self.dialogue_lengths, dtype=tf.float32)
            outputs = tf.concat([output_fw, output_bw], axis=-1)
            outputs = tf.nn.dropout(outputs, 0.8)

        with tf.variable_scope("proj1"):
            output = tf.reshape(outputs, [-1, 2 * hidden_size_lstm_2])
            W = tf.get_variable("W", dtype=tf.float32, shape=[2 * hidden_size_lstm_2, proj1], initializer=tf.contrib.layers.xavier_initializer())
            b = tf.get_variable("b", dtype=tf.float32, shape=[proj1], initializer=tf.zeros_initializer())
            output = tf.nn.relu(tf.matmul(output, W) + b)

        with tf.variable_scope("proj2"):
            W = tf.get_variable("W", dtype=tf.float32, shape=[proj1, proj2], initializer=tf.contrib.layers.xavier_initializer())
            b = tf.get_variable("b", dtype=tf.float32, shape=[proj2], initializer=tf.zeros_initializer())
            output = tf.nn.relu(tf.matmul(output, W) + b)

        with tf.variable_scope("logits"):
            nstep = tf.shape(outputs)[1]
            W = tf.get_variable("W", dtype=tf.float32, shape=[proj2, tags], initializer=tf.random_uniform_initializer())
            b = tf.get_variable("b", dtype=tf.float32, shape=[tags], initializer=tf.zeros_initializer())
            pred = tf.matmul(output, W) + b
            self.logits = tf.reshape(pred, [-1, nstep, tags])

        with tf.variable_scope("loss"):
            log_likelihood, self.trans_params = tf.contrib.crf.crf_log_likelihood(
                self.logits, self.labels, self.dialogue_lengths)
            self.loss = tf.reduce_mean(-log_likelihood) + tf.nn.l2_loss(W) + tf.nn.l2_loss(b)
            # tf.summary.scalar("loss", self.loss)

        with tf.variable_scope("viterbi_decode"):
            viterbi_sequence, _ = tf.contrib.crf.crf_decode(self.logits, self.trans_params, self.dialogue_lengths)
            batch_size = tf.shape(self.dialogue_lengths)[0]
            output_ta = tf.TensorArray(dtype=tf.float32, size=1, dynamic_size=True)

            def body(time, output_ta_1):
                length = self.dialogue_lengths[time]
                vcode = viterbi_sequence[time][:length]
                true_labs = self.labels[time][:length]
                accurate = tf.reduce_sum(tf.cast(tf.equal(vcode, true_labs), tf.float32))
                output_ta_1 = output_ta_1.write(time, accurate)
                return time + 1, output_ta_1

            def condition(time, output_ta_1):
                return time < batch_size

            i = 0
            [time, output_ta] = tf.while_loop(condition, body, loop_vars=[i, output_ta])
            output_ta = output_ta.stack()
            accuracy = tf.reduce_sum(output_ta)
            self.accuracy = accuracy / tf.reduce_sum(tf.cast(self.dialogue_lengths, tf.float32))
            # tf.summary.scalar("accuracy", self.accuracy)

        with tf.variable_scope("train_op"):
            optimizer = tf.train.AdagradOptimizer(0.1)
            # if tf.greater(self.clip, 0):
            grads, vs = zip(*optimizer.compute_gradients(self.loss))
            grads, gnorm = tf.clip_by_global_norm(grads, self.clip)
            self.train_op = optimizer.apply_gradients(zip(grads, vs))
            # else:
            #     self.train_op = optimizer.minimize(self.loss)

        # self.merged = tf.summary.merge_all()
### Set model variables
hidden_size_lstm_1 = 200
hidden_size_lstm_2 = 200
tags = 39 # assuming number of classes to predict?
word_dim = 300
proj1 = 200
proj2 = 100
words = 20001
# words = 8759 + 1 # max(num_unique_word_tokens)
batchSize = 2
log_dir = "train"
model_dir = "DAModel"
model_name = "ckpt"
### Run model
def main():
    # tokenize and vectorize text data to prepare for embedding
    train_data = sequences[:75]
    train_labels = label_list[:75]
    dev_data = sequences[75:]
    dev_labels = label_list[75:]

    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.4
    with tf.Session(config=config) as sess:
        model = DAModel()
        sess.run(tf.global_variables_initializer())
        clip = 2
        saver = tf.train.Saver()
        # writer = tf.summary.FileWriter("D:\\Experimemts\\tensorflow\\DA\\train", sess.graph)
        writer = tf.summary.FileWriter("train", sess.graph)
        counter = 0
        for epoch in range(10):
            for dialogues, labels in minibatches(train_data, train_labels, batchSize):
                _, dialogue_lengthss = pad_sequences(dialogues, 0)
                word_idss, utterance_lengthss = pad_sequences(dialogues, 0, nlevels=2)
                true_labs = labels
                labs_t, _ = pad_sequences(true_labs, 0)
                counter += 1
                train_loss, train_accuracy, _ = sess.run(
                    [model.loss, model.accuracy, model.train_op],
                    feed_dict={model.word_ids: word_idss,
                               model.utterance_lengths: utterance_lengthss,
                               model.dialogue_lengths: dialogue_lengthss,
                               model.labels: labs_t,
                               model.clip: clip})
                # writer.add_summary(summary, global_step = counter)
                print("step = {}, train_loss = {}, train_accuracy = {}".format(counter, train_loss, train_accuracy))

                train_precision_summ = tf.Summary()
                train_precision_summ.value.add(
                    tag='train_accuracy', simple_value=train_accuracy)
                writer.add_summary(train_precision_summ, counter)
                train_loss_summ = tf.Summary()
                train_loss_summ.value.add(
                    tag='train_loss', simple_value=train_loss)
                writer.add_summary(train_loss_summ, counter)

                if counter % 1 == 0:
                    loss_dev = []
                    acc_dev = []
                    for dev_dialogues, dev_labels in minibatches(dev_data, dev_labels, batchSize):
                        _, dialogue_lengthss = pad_sequences(dev_dialogues, 0)
                        word_idss, utterance_lengthss = pad_sequences(dev_dialogues, 0, nlevels=2)
                        true_labs = dev_labels
                        labs_t, _ = pad_sequences(true_labs, 0)
                        dev_loss, dev_accuacy = sess.run(
                            [model.loss, model.accuracy],
                            feed_dict={model.word_ids: word_idss,
                                       model.utterance_lengths: utterance_lengthss,
                                       model.dialogue_lengths: dialogue_lengthss,
                                       model.labels: labs_t})
                        loss_dev.append(dev_loss)
                        acc_dev.append(dev_accuacy)
                    valid_loss = sum(loss_dev) / len(loss_dev)
                    valid_accuracy = sum(acc_dev) / len(acc_dev)

                    dev_precision_summ = tf.Summary()
                    dev_precision_summ.value.add(
                        tag='dev_accuracy', simple_value=valid_accuracy)
                    writer.add_summary(dev_precision_summ, counter)
                    dev_loss_summ = tf.Summary()
                    dev_loss_summ.value.add(
                        tag='dev_loss', simple_value=valid_loss)
                    writer.add_summary(dev_loss_summ, counter)
                    print("counter = {}, dev_loss = {}, dev_accuacy = {}".format(counter, valid_loss, valid_accuracy))


if __name__ == "__main__":
    tf.reset_default_graph()
    main()
The data comes from here and looks like this:
[[['what '],
['do you want to start '],
['f uh laughter you hit you hit f uh '],
['it doesnt matter '],
['f um were discussing the capital punishment i believe '],
['right '],
['you are right '],
['yeah '],
[' i i suppose i should have '],
['f uh which '],
['i am am pro capital punishment except that i dont like the way its done '],
['uhhuh '],
['f uh yeah '],
['f uh i f uh i guess i i hate to see anyone die f uh ']
...
]]
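To make the shapes concrete, here is a toy illustration (made-up values, not from my actual data) of what one padded batch looks like by the time it is fed to the placeholders:

# Toy illustration with made-up values: a padded batch of 2 dialogues.
# word_ids has shape [batch, max_dialogue_len, max_utterance_len]
word_idss = [[[4, 12, 0], [7, 0, 0]],
             [[9, 3, 5], [0, 0, 0]]]  # 2nd dialogue has only 1 real utterance
# utterance_lengths has shape [batch, max_dialogue_len]
utterance_lengthss = [[2, 1],
                      [3, 0]]         # 0 marks the padded-out utterance
# dialogue_lengths has shape [batch]
dialogue_lengthss = [2, 1]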
The dataset to train the model can be found here: https://github.com/cmeaton/Hierarchical_BiLSTM-CRF_Encoder/tree/master/swda_parsed
I'm having a hard time understanding what this error even means and how to approach debugging it. Any advice would be much appreciated. Thanks.
I think the main problem is a mismatch in the sizes of the arrays (or matrices, or other structures) you are feeding to sess.run, specifically in this call:

train_loss, train_accuracy, _ = sess.run(
    [model.loss, model.accuracy, model.train_op],
    feed_dict={model.word_ids: word_idss,
               model.utterance_lengths: utterance_lengthss,
               model.dialogue_lengths: dialogue_lengthss,
               model.labels: labs_t,
               model.clip: clip})

More specifically, this part of the error message hints that it's a mismatch problem:
tensorflow.python.framework.errors_impl.InvalidArgumentError:
indices[317] = [317, -1] does not index into param shape [318,39,400]
[[{{node utterance_encoder/GatherNd}}]]
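Note what the index pair is telling you: gather_nd was asked for [317, -1], i.e. time step -1 of row 317. In select(), the time index is computed as length - 1, so a -1 can only come from an utterance length of 0, which pad_sequences produces for utterances that exist only as padding (or whose token list came back empty from the tokenizer). If that is what is happening, one possible guard (a sketch based on your posted code, not tested against the repo) is to clamp the index in select():

def select(parameters, length):
    """Select the last valid time step output as the sentence embedding.
    :params parameters: [batch, seq_len, hidden_dims]
    :params length: [batch]
    :Returns: [batch, hidden_dims]
    """
    shape = tf.shape(parameters)
    idx = tf.range(shape[0])
    # Clamp so a padded utterance (length == 0) reads time step 0
    # instead of the out-of-range index -1.
    idx = tf.stack([idx, tf.maximum(length - 1, 0)], axis=1)
    return tf.gather_nd(parameters, idx)

The rows this clamp affects are padding anyway, and they should be masked by dialogue_lengths in the dialogue-level LSTM; alternatively, you could filter out empty token sequences during preprocessing so that no utterance length is ever 0.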
I thought that running on a fresh install might result in an error-free run, but I am getting similar errors, along with a whole list of warnings. Please note I am running on Windows 7 and using Python 3.6.1.
I have tried the following tensorflow versions but with no success:
I think the following might be important:
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[317] = [317, -1] does not index into param shape [318,39,400]
[[{{node utterance_encoder/GatherNd}}]]
WARNING:tensorflow:From test.py:313: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
WARNING:tensorflow:From test.py:256: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From test.py:259: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-01-31 12:13:10.096283: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
WARNING:tensorflow:From test.py:119: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From test.py:121: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From test.py:130: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
WARNING:tensorflow:From test.py:137: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From test.py:147: LSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:From test.py:150: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn.py:464: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn_cell_impl.py:958: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn_cell_impl.py:962: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\rnn.py:244: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From test.py:163: BasicLSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:From test.py:223: The name tf.train.AdagradOptimizer is deprecated. Please use tf.compat.v1.train.AdagradOptimizer instead.
WARNING:tensorflow:From D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\training\adagrad.py:76: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From test.py:261: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.
WARNING:tensorflow:From test.py:263: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From test.py:265: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
2020-01-31 12:13:16.563989: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[317] = [317, -1] does not index into param shape [318,39,400]
Traceback (most recent call last):
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
target_list, run_metadata)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[317] = [317, -1] does not index into param shape [318,39,400]
[[{{node utterance_encoder/GatherNd}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 314, in <module>
main()
File "test.py", line 274, in main
train_loss, train_accuracy, _ = sess.run([model.loss, model.accuracy,model.train_op], feed_dict = {model.word_ids: word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t, model.clip :clip} )
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 956, in run
run_metadata_ptr)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
run_metadata)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[317] = [317, -1] does not index into param shape [318,39,400]
[[node utterance_encoder/GatherNd (defined at D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Original stack trace for 'utterance_encoder/GatherNd':
File "test.py", line 314, in <module>
main()
File "test.py", line 260, in main
model = DAModel()
File "test.py", line 155, in __init__
output = select(output, length) # [batch_size, dim]
File "test.py", line 114, in select
return tf.gather_nd(parameters, idx)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\array_ops.py", line 4277, in gather_nd
return gen_array_ops.gather_nd(params, indices, name=name)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 3975, in gather_nd
"GatherNd", params=params, indices=indices, name=name)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "D:\Users\bakopme\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
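If it helps anyone narrow this down, here is a quick hypothetical check (using the pad_sequences helper from the question) that can be dropped into the training loop right after padding, to confirm whether a zero utterance length is reaching the model:

# Hypothetical sanity check: a 0 here means a fully padded (or empty)
# utterance, which makes length - 1 == -1 inside select() and triggers
# the GatherNd error above.
word_idss, utterance_lengthss = pad_sequences(dialogues, 0, nlevels=2)
min_len = min(l for row in utterance_lengthss for l in row)
print("min utterance length in this batch:", min_len)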