ali izadi

Reputation: 549

BigQuery Workflow Notebook Error: "Unable to generate link to the notebook output or the bucket which contains it"

I’m running a notebook on BigQuery as part of a Dataform workflow. My goal is to analyze content using TF-IDF to find internal link opportunities, but when I execute the notebook, I encounter the following error:

Error Message:

Failure reason: Error encountered during cell execution. Notebook output: Unable to generate link to the notebook output or the bucket which contains it.

Here’s the code I’m working with:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

df = pd.read_csv('content.csv')

df.head()

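def find_internal_link_opportunities(df):
    contents = df['content'].tolist()
    # fit_transform returns the sparse TF-IDF matrix, not the vectorizer itself
    tfidf_matrix = TfidfVectorizer().fit_transform(contents)
    # cosine_similarity accepts sparse input, so no dense conversion is needed
    csim = cosine_similarity(tfidf_matrix)
    return csim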

csim = find_internal_link_opportunities(df)

def display_link_opportunities(df, csim, threshold=0.5):
    results = []

    # Use positional indices so lookups into csim stay valid even if df has a non-default index
    for pos, (_, row) in enumerate(df.iterrows()):
        for i, score in enumerate(csim[pos]):
            # Skip the page itself and anything at or below the threshold
            if score > threshold and i != pos:
                results.append({
                    'url': row['Address'],
                    'sim_url': df.iloc[i]['Address'],
                    'sim_ratio': f"{score:.2f}"
                })

    # Create a DataFrame to display results in a table format
    result_df = pd.DataFrame(results, columns=['url', 'sim_url', 'sim_ratio'])

    return result_df

final_res = display_link_opportunities(df, csim, threshold=0.2)

I’ve tried the following:

- Saving the output in a CSV file.
- Ensuring I have all the required access and permissions.
- Setting the correct bucket as the active storage.

Despite these efforts, I still encounter the same error: “Unable to generate link to the notebook output or the bucket which contains it.”

Has anyone encountered this issue before or knows how to fix it?

Additional Info: The notebook is part of a Dataform workflow in BigQuery. I have granted the necessary permissions for bucket access. I am able to run the notebook but cannot access the output or logs from the notebook. Any help or insights would be appreciated!
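
Since the failure reason only says that a cell raised an error, one thing that may be worth ruling out is the input itself: the code reads 'content.csv' from a relative path, which may not resolve the same way in the workflow's runtime as in an interactive session. The sketch below is a minimal, standard-library-only check that could run as the first cell. The helper name check_input_csv is hypothetical (not part of the original notebook), and the required column names are taken from the code above. It does not address the output bucket or its permissions.

```python
import csv
from pathlib import Path

# Column names used by the code in the question.
REQUIRED_COLUMNS = {"content", "Address"}


def check_input_csv(path, required=REQUIRED_COLUMNS):
    """Return a list of problems found; an empty list means the file looks usable.

    Hypothetical helper for illustration only.
    """
    file_path = Path(path)
    if not file_path.is_file():
        # A relative path is resolved against the runtime's working directory,
        # so report the absolute path that was actually checked.
        return [f"file not found: {file_path.resolve()}"]

    # utf-8-sig also strips a byte-order mark if the export added one.
    with file_path.open(newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])

    missing = sorted(set(required) - set(header))
    return [f"missing column: {name}" for name in missing]


# Report any problems before pandas or scikit-learn are involved.
print(check_input_csv("content.csv") or "content.csv looks usable")
```

If this reports a missing file or column, the cell failure would come from the input rather than from the output bucket, which narrows down where to look next.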

Upvotes: 1

Views: 63

Answers (0)
