Feature Request / Ideas: Text For Product Feed From Cornerstone Content

Hi @charlie,

I hope all is well with you. I am not quite sure if this is a feature request or just picking your brain!

We have a few Pro websites which are eCommerce stores and consequently have a Google Merchant Center feed to upload the store products to Google Shopping / Ads. Until recently we were using a (tiresome and laborious) manual .csv feed. However, we have now started using the brilliant WP All Export plugin, which has a dedicated Google Merchant Center output, which can be automated. It works perfectly with Pro, except for one specific and important area - Product Description.

The main Product Description will not output in any form at all. This is presumably due to the Cornerstone shortcodes, their ...id="..." messing with the export / import process. The only description I can successfully export with the feed is the Woocommerce Short Description.

What I wanted to ask is, would it be at all possible either now, or as a future feature of Pro, to enable something like the following: Content is created in Cornerstone, as normal. As or after it has been created, all text is output or duplicated to a separate field (ACF Pro?), which can subsequently be used as a text field to output through the WP All Export feed.

In one example, we have a website which has a large series of 18th Century artwork which is being used for Print on Demand products. As the information about artists, the collection, the birds and animals is repeated across the product ranges, we are using a series of Components to simplify product creation. For example, a ceramic mug product will have a Component called “Product - Mug”, which is inserted as the sole content for each individual Product. “Product - Mug” is created using further sub-Components - “Description - Mug”, “Description - Art Collection”, “Artist Information”, “Mug - Specifications”. Each Product also has a set of ACF Pro fields (Artist Name, Bird Name, Product Name, etc.) which feed through to all of the Components just mentioned, as dynamic content. We can therefore control each product range quite well with information relevant to each type of product. In time this will become a substantial range of products with thousands of potential products and variants, which is why we are harnessing the power of Components and dynamic content. I hope this illustrates why and how getting a Product Description for a product feed is a headache!

Is there, or could there be, a solution to this conundrum - or maybe an even better solution? Dealing with product feeds where content has been created with Cornerstone seems to be the one slight downside I have come across with it. :grin:

Thanks as ever,
Christopher

P.S. There must be many, many Themeco users in a similar situation?

I’ve been very well. Hope your summer has been great.

So you want a postmeta field with the generated HTML? Is the issue with Components or just shortcodes in general? This came up when trying to fix Yoast, and it would’ve been a lot cleaner if there was a field for me to reference, as opposed to doing shortcode gymnastics. There are other plugins affected by this as well. Alternatively there might be a filter or hook to tap into for the product description. There’s a similar request on our end that came up a while ago. The request was to store HTML in post content and store shortcodes as postmeta. There are several plugins that look at our content expecting shortcodes, so I would probably just create a new postmeta field for you. Have a great weekend!

Hi @charlie,

It has been a warm few weeks here in the UK, for a change, so we have seen a bit of summer!

Neither Components nor shortcodes are the actual issue. The issue is that from the admin area of the website, anything created with Cornerstone is only accessible as shortcodes if I try to export it, rather than text or HTML. Only when the content is rendered on the front end does it appear as HTML, if that makes sense.

In short, I am trying to get raw content for a product feed. A good example: if you create a Woocommerce Product in the standard way (i.e. adding a description with Woocommerce’s text editor, not Cornerstone), use the Woocommerce Product export function, and export the “Description” field, the full text is exported, albeit with characters like \n. This export appears to be fully usable for an export feed like Google Merchant Center. However, the same export process outputs the Cornerstone shortcodes when the description has been created with Cornerstone.

In the below screenshot, compare Column H which is the product’s “Short Description” created with the default Woocommerce text editor. Column I is the main “Description” field where Cornerstone has created the content. You can see the difference between the two sets of content output in the export.

Cornerstone is a fantastic builder and I have been using it for Product descriptions in recent times (as well as Pages and Posts) in preference to the basic Woocommerce text editor - and will continue to do so. Getting usable text for feeds is the only missing bit.

If the HTML / basic text output can be achieved, I will bow to your greater knowledge and thoughts as to how it should be done! The export method I use, WP All Export / WP All Import can access all types of data, including ACF, postmeta, etc…

Many thanks for your help,
Christopher

Hi @charlie,

I hope you are well. A final question: is this a feature which may be available soon, or is it a long-term goal? It would be useful to know, so I can make necessary changes to our GMC feeds.

Thanks,
Christopher

Not necessarily long-term. I had some time to take a look today. I guess it depends what you are trying to save / use, since it’s easier on my end to just save the post content in a post meta value and not the entire page. It seems like it would create bloat, so I would probably not have it on by default. And then on your end you would set up which key you want to use through the following filter. Does this sound like how you want to use it?

add_filter("cs_content_post_meta_key", function() {
  return "cs_content_cache";
});

Hi @charlie,

Yeah, I think that is sort of it. I know it will cause bloat, but it looks to be the only way to get clean text for the product feeds for any content created with Cornerstone (see screenshots at bottom of this post).

I have double-checked and the output content for any feeds should ideally be in ASCII format and can include HTML.

I have two pieces of code which work and also convert text to ASCII, but with one exception - getting the source for the HTML for the product description without any Cornerstone shortcodes.

This first piece of code writes post meta:

add_action('save_post_product', 'save_description_as_post_meta', 10, 3);

function save_description_as_post_meta($post_id, $post, $update) {
    // Check for autosave.
    if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE) {
        return $post_id;
    }

    // Check permissions. This hook only fires for products, and
    // $_POST['post_type'] may not be set on every save request,
    // so check the capability directly.
    if (!current_user_can('edit_post', $post_id)) {
        return $post_id;
    }

    // Get the description from the product post content
    $description = $post->post_content;

    if (empty($description)) {
        return $post_id;
    }

    // Convert the description to ASCII
    $ascii_description = iconv('UTF-8', 'ASCII//TRANSLIT', $description);

    // Save the description as post meta.
    // Replace '_prod_feed_desc' with the desired meta key.
    update_post_meta($post_id, '_prod_feed_desc', $ascii_description);
}

The second piece of code is similar, but writes the description to an ACF Pro field (i.e. a method to make it visible on the product’s admin page). In this example, field_649d437490f8a is the key of my ACF Pro field.

add_action('save_post_product', 'save_description_as_acf', 10, 3);

function save_description_as_acf($post_id, $post, $update) {
    // Check for autosave.
    if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE) {
        return $post_id;
    }

    // Check permissions. This hook only fires for products, and
    // $_POST['post_type'] may not be set on every save request,
    // so check the capability directly.
    if (!current_user_can('edit_post', $post_id)) {
        return $post_id;
    }

    // Get the description from the product post content
    $description = $post->post_content;

    if (empty($description)) {
        return $post_id;
    }

    // Convert the description to ASCII
    $ascii_description = iconv('UTF-8', 'ASCII//TRANSLIT', $description);

    // Update the ACF field with the ASCII description.
    // Using the key of your ACF field.
    update_field('field_649d437490f8a', $ascii_description, $post_id);
}

This last piece of code populates the ACF field as in the screenshot:

[Screenshot: 2023-06-29_12-51-00]

What I would hope to see is more like the screenshot below, which is the actual raw content contained within the Cornerstone shortcodes of the first screenshot:

[Screenshot: 2023-06-29_15-22-29]

In both examples, the post meta or ACF Pro field is updated whenever the Cornerstone Save button is clicked (not the main WordPress product update button).

I hope all this makes sense!
Christopher

Here is one further example which gets data, but still not the actual page text. In this case it looks like all page meta settings.

add_action('save_post_product', 'save_description_as_acf', 10, 3);

function save_description_as_acf($post_id, $post, $update) {
    if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE) {
        return $post_id;
    }

    // This hook only fires for products, and $_POST['post_type'] may not
    // be set on every save request, so check the capability directly.
    if (!current_user_can('edit_post', $post_id)) {
        return $post_id;
    }

    $content = get_post_meta($post_id, apply_filters("cs_content_post_meta_key", ""), true);

    if (!is_string($content)) {
        if (is_array($content) || is_object($content)) {
            $content = json_encode($content);
        } else {
            return $post_id;
        }
    }

    if (!function_exists('iconv')) {
        return $post_id;
    }

    $ascii_content = @iconv('UTF-8', 'ASCII//TRANSLIT', $content);
    
    if ($ascii_content === false) {
        return $post_id;
    }

    update_field('field_649d437490f8a', $ascii_content, $post_id);
}

Its output looks like the below:

Okay so you need this data on a save request? I have to change my initial idea a bit for that, but I think we can still do that for you.

And just to clarify: using do_shortcode would solve this, it’s just that it isn’t very feasible when you want to reference the result in multiple places? Example:

    $description = do_shortcode($post->post_content);

Hi @charlie,

I am not particularly fussed how to get the data. All I am hoping to achieve is to get a clean HTML copy of each product’s main description which can be used in a product feed, just like standard Gutenberg descriptions can be used.

The three functions in my previous reply were just me tinkering around, trying to find a sort of solution! I am not very adept at coding.

Thanks again,
Christopher

Code looks fine to me. Yeah, I think if you just add do_shortcode, that should process Cornerstone content to HTML. I’d love to see a working example of what you’re trying to do so I can see if there is anything to note before I finish this. If you don’t have the time, that’s okay; I will probably look at this again around the time of the 6.3.0 release.

I would LOVE to be able to crack this nut, but my skills are very limited - the code examples were generated by ChatGPT from detailed prompts :wink:! I would like to think I was genius enough to write the code, but not a hope! At the least, I hoped the code may be of use to you.

Joking aside, what I could not find was a method to get the Cornerstone output from the meta data in the database and strip out everything but the content and HTML. I am certain there will be tricks you know to do that.

Have a great weekend,
Christopher

P.S. Any ideas or pointers with do_shortcode and I will see what I can dig up - or not!

@charlie - got one for you in the Secure Note!

Hello. Thanks for the kind words. The rest is in a secure note.

No problem! More in secure note.

Do you want to try this filter out? On save it should add to post meta _cs_built_html. This is about what I used, when I did something on another project. Let me know if this is what you were looking for.

// On Document save, update post meta field
// _cs_built_html with an HTML copy of
// the post
add_filter("cs_save_document", function($doc) {
    try {
        ob_start();

        // Needed For styling
        do_action("wp_enqueue_scripts");

        // Find document and renderContentFromDocument
        $resolver = cornerstone("Resolver");
        $docDB = $resolver->getDocument( $doc->id() );
        $html = cornerstone("Resolver")->renderContentFromDocument($docDB);

        $html.= ob_get_clean();

        // Update post meta with HTML
        update_post_meta($doc->id(), '_cs_built_html', $html);
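    } catch(\Throwable $e) {
        trigger_error("Error with document HTML post meta saving : " . $e->getMessage());
    }

    // Hand the document back, since WordPress filter callbacks are expected to return their value
    return $doc;
});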

Thank you so much! That is very close to what I am looking for.

The slight amendment is that only these HTML tags are allowed in the output: <ul>, <ol>, <li>, <br>, <p>, and <b>, and the CSS contained in the <style></style> section is not permitted. Is it possible to output just the pure HTML, excluding the CSS stylesheet in the <style></style> section of the current output? I am not worried about heading tags or classes named within HTML tags, as they will be converted to <p> tags when uploaded to Google Merchant Center.

A second question is about bulk updating. The function obviously works at Cornerstone Save, as opposed to core WP post update. Is there a way to bulk update all products (or other posts), so that _cs_built_html is populated in bulk? I have several hundred products/variations!

What you have sent through is a major breakthrough though!

Many many thanks,
Christopher

Hi @charlie,

Thanks to your help yesterday, I think I have now achieved what I was looking for, having altered your code with AI help. I consequently now have clean pure HTML, stripped of classes and styles. Here is the amended code:

add_filter("cs_save_document", function($doc) {
    try {
        ob_start();

        // Needed For styling
        do_action("wp_enqueue_scripts");

        // Find document and renderContentFromDocument
        $resolver = cornerstone("Resolver");
        $docDB = $resolver->getDocument( $doc->id() );
        $html = cornerstone("Resolver")->renderContentFromDocument($docDB);

        $html .= ob_get_clean();

        // Attributes to remove
        $attributes = [
            'class', 'style', 'data-x-effect', 'data-x-slide-container', 'loading', 'width', 'height',
            'aria-hidden', 'tabindex', 'data-x-effect-provider', 'data-x-slide-context', 'data-x-slide',
            'aria-expanded', 'id', 'role', 'aria-selected', 'aria-controls', 'data-x-toggle', 'data-x-toggleable',
            'data-x-toggle-collapse', 'aria-labelledby'
        ];

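        // Remove attributes. Requiring leading whitespace stops short names
        // such as 'id' from matching the tail of a longer attribute name.
        foreach ($attributes as $attribute) {
            $html = preg_replace('/\s+' . preg_quote($attribute, '/') . '\s*=\s*".*?"/i', '', $html);
        }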

        // Remove content within certain tags
        $tags = ['style', 'iframe', 'script', 'i', 'blockquote'];
        foreach ($tags as $tag) {
            $html = preg_replace('/<' . $tag . '\b[^>]*>.*?<\/' . $tag . '>/is', '', $html);
        }

        // Remove span tags
        $html = preg_replace('~</?span[^>]*>~i', '', $html);

        // Reduce to ASCII (//IGNORE asks iconv to discard characters it cannot represent)
        $html = iconv('UTF-8', 'ASCII//IGNORE', $html);

        // Remove all div tags
        $html = preg_replace('~</?div[^>]*>~i', '', $html);

        // Update post meta with HTML
        update_post_meta($doc->id(), '_cs_built_html', $html);
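    } catch(\Throwable $e) {
        trigger_error("Error with document HTML post meta saving : " . $e->getMessage());
    }

    // Hand the document back, since WordPress filter callbacks are expected to return their value
    return $doc;
});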
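
A side note on the ASCII step in the code above: what iconv does with characters it cannot represent in ASCII varies between iconv implementations, so the result can differ from server to server. The following is only a minimal sketch of a more predictable conversion, assuming the feed really does need 7-bit ASCII. The function name feed_to_ascii and the character map are hypothetical, not part of Cornerstone, WordPress, or WP All Export.

```php
<?php
// Hypothetical helper: convert UTF-8 text to plain 7-bit ASCII for a feed.
function feed_to_ascii(string $text): string
{
    // Map a few common typographic characters by hand first, so the result
    // does not depend on how the local iconv build transliterates them.
    $text = strtr($text, [
        "\u{2018}" => "'", "\u{2019}" => "'",   // curly single quotes
        "\u{201C}" => '"', "\u{201D}" => '"',   // curly double quotes
        "\u{2013}" => '-', "\u{2014}" => '-',   // en and em dashes
        "\u{2026}" => '...',                    // ellipsis
        "\u{00A0}" => ' ',                      // non-breaking space
    ]);

    // Try transliteration; iconv() returns false on failure.
    $ascii = function_exists('iconv')
        ? @iconv('UTF-8', 'ASCII//TRANSLIT', $text)
        : false;

    // If transliteration failed, carry on with the mapped text.
    if ($ascii === false) {
        $ascii = $text;
    }

    // Whatever happened above, drop any byte outside the 7-bit ASCII range.
    return (string) preg_replace('/[^\x00-\x7F]/', '', $ascii);
}
```

Calling feed_to_ascii($html) in place of the iconv line would be one way to try it. The hand-written map is deliberately short and would need extending for other characters that appear in the product copy.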

In addition, I have a small plugin to bulk generate the HTML into the meta data. It is a bit rough and ready, but does the trick though! The plugin code is uploaded to a new folder with the same name as the plugin file in wp-content/plugins and is then activated as a normal plugin. Below is the plugin code:

<?php
/**
* Plugin Name: Cornerstone Bulk HTML Generator
* Description: This is a custom plugin to update all posts and generate HTML for Cornerstone.  
* Version: 1.0
* Author: White Media
* Author URI: https://www.whitemedia.uk/
*/

add_action('admin_menu', 'cornerstone_bulk_html_generator_menu');

function cornerstone_bulk_html_generator_menu() {
    add_menu_page(
        'Cornerstone Bulk HTML Generator', 
        'Bulk HTML Generator', 
        'manage_options', 
        'cornerstone-bulk-html-generator', 
        'cornerstone_bulk_html_generator_page', 
        '', 
        6
    );
}

function cornerstone_bulk_html_generator_page() {
    // Get all public post types
    $post_types = get_post_types(array('public' => true));

    $continue = get_option('cornerstone_bulk_html_generator_continue', false);
    $button_color = $continue ? '#0073aa' : '#ddd';
    $button_disabled = $continue ? '' : 'disabled';
    ?>
    <div class="wrap">
        <h1>Cornerstone Bulk HTML Generator</h1>
        <form method="post" action="">
            <select name="post_type">
                <option value="">All Post Types</option>
                <?php
                foreach ($post_types as $post_type) {
                    echo '<option value="' . esc_attr($post_type) . '">' . esc_html($post_type) . '</option>';
                }
                ?>
            </select>
            <input type="submit" name="update_posts" value="Update Posts" class="button button-primary" />
        </form>
        <form method="post" action="" style="margin-top: 20px;">
            <input type="submit" name="continue_posts" value="Continue" class="button button-primary" style="background-color: <?php echo $button_color; ?>" <?php echo $button_disabled; ?> />
        </form>
    </div>
    <?php

    if (isset($_POST['update_posts'])) {
        // Sanitize the submitted post type before storing or using it
        $selected_type = isset($_POST['post_type']) ? sanitize_key(wp_unslash($_POST['post_type'])) : '';
        update_option('cornerstone_bulk_html_generator_post_type', $selected_type);
        cornerstone_bulk_html_generator_update_all_posts($selected_type);
    }

    if (isset($_POST['continue_posts']) && $continue) {
        cornerstone_bulk_html_generator_update_all_posts(get_option('cornerstone_bulk_html_generator_post_type', ''));
    }
}

function cornerstone_bulk_html_generator_update_all_posts($post_type = '') {
    $batch_size = 50;
    $offset = (int) get_option('cornerstone_bulk_html_generator_offset', 0);

    $args = array(
        'numberposts' => -1,
        'offset' => $offset,
        'post_type'   => $post_type ? $post_type : get_post_types(array('public' => true)),
    );

    $total_posts = count(get_posts($args));

    $args['numberposts'] = $batch_size;
    $posts = get_posts($args);

    if (empty($posts)) {
        echo '<div class="updated"><p>All posts updated successfully.</p></div>';
        update_option('cornerstone_bulk_html_generator_offset', 0);
        update_option('cornerstone_bulk_html_generator_continue', false);
        return;
    }

    foreach($posts as $post) {
        try {
            ob_start();

            do_action("wp_enqueue_scripts");

            $resolver = cornerstone("Resolver");
            $docDB = $resolver->getDocument($post->ID);
            $html = cornerstone("Resolver")->renderContentFromDocument($docDB);

            $html .= ob_get_clean();

            // Attributes to remove
            $attributes = [
                'class', 'style', 'data-x-effect', 'data-x-slide-container', 'loading', 'width', 'height',
                'aria-hidden', 'tabindex', 'data-x-effect-provider', 'data-x-slide-context', 'data-x-slide',
                'aria-expanded', 'id', 'role', 'aria-selected', 'aria-controls', 'data-x-toggle', 'data-x-toggleable',
                'data-x-toggle-collapse', 'aria-labelledby'
            ];

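            // Remove attributes. Requiring leading whitespace stops short names
            // such as 'id' from matching the tail of a longer attribute name.
            foreach ($attributes as $attribute) {
                $html = preg_replace('/\s+' . preg_quote($attribute, '/') . '\s*=\s*".*?"/i', '', $html);
            }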

            // Remove content within certain tags
            $tags = ['style', 'iframe', 'script', 'i', 'blockquote'];
            foreach ($tags as $tag) {
                $html = preg_replace('/<' . $tag . '\b[^>]*>.*?<\/' . $tag . '>/is', '', $html);
            }

            // Remove span tags
            $html = preg_replace('~</?span[^>]*>~i', '', $html);

            // Reduce to ASCII (//IGNORE asks iconv to discard characters it cannot represent)
            $html = iconv('UTF-8', 'ASCII//IGNORE', $html);

            // Remove all div tags
            $html = preg_replace('~</?div[^>]*>~i', '', $html);

            update_post_meta($post->ID, '_cs_built_html', $html);
        } catch(\Throwable $e) {
            trigger_error("Error with document HTML post meta saving : " . $e->getMessage());
        }
    }

    update_option('cornerstone_bulk_html_generator_offset', $offset + $batch_size);
    update_option('cornerstone_bulk_html_generator_continue', true);

    $progress = min(100, (($offset + $batch_size) / $total_posts) * 100);

    echo '<div class="updated"><p>Updated ' . count($posts) . ' posts. </p></div>';
    echo '<div style="width: 100%; background: #f5f5f5; border: 1px solid #ddd; margin: 20px 0; height: 20px; display: flex; align-items: center;"><div style="height: 100%; background: #0073aa; width:' . $progress . '%;"></div></div>';
}
?>

Please let me know if this works for you as well.

Many thanks again,
Christopher
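
The offset and batch arithmetic in the plugin above can be looked at on its own. The sketch below is illustrative only and uses a plain array of IDs instead of get_posts(). The function name batch_step and its return keys are hypothetical. Unlike the plugin, it advances the offset by the number of items actually processed and guards the progress calculation against an empty list.

```php
<?php
// Hypothetical helper illustrating the offset/batch arithmetic only.
function batch_step(array $ids, int $offset, int $batchSize): array
{
    $total = count($ids);

    // Take the next slice of IDs, starting where the last run stopped.
    $batch = array_slice($ids, $offset, $batchSize);

    // Advance by what was actually taken, not by the nominal batch size.
    $next = $offset + count($batch);

    return [
        'batch'    => $batch,
        'next'     => $next,
        'done'     => $next >= $total,
        // Guard against division by zero when there is nothing to process.
        'progress' => $total > 0 ? (int) floor($next / $total * 100) : 100,
    ];
}

// Usage: walk 120 IDs in batches of 50.
$ids   = range(1, 120);
$first = batch_step($ids, 0, 50);
$last  = batch_step($ids, 100, 50);
```

In the plugin, the 'next' value would be what gets stored in the offset option between clicks of the Continue button.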

Very cool code generation. It tried to brute force a little, but I think I can help you two out.

I’m pretty sure if you remove this line it won’t output styling.

        // Needed For styling
        do_action("wp_enqueue_scripts");

It seems you don’t really want many tags at all. I would try strip_tags and see if it does better, placing it right after the ob_get_clean() line.

        $html.= ob_get_clean();
        $html = strip_tags($html, "<br><p>"); // Add valid tags

https://www.php.net/manual/en/function.strip-tags.php
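
To illustrate the strip_tags suggestion against the tag list mentioned earlier in the thread (<ul>, <ol>, <li>, <br>, <p>, <b>), here is a rough sketch. Two things are worth knowing: strip_tags removes the <style> and <script> tags themselves but keeps the text between them, and it leaves attributes on the tags it allows. The function name feed_clean_html is hypothetical, and the regular expressions assume reasonably well-formed markup with no ">" inside attribute values.

```php
<?php
// Hypothetical helper; the allow-list mirrors the tags named in the thread.
function feed_clean_html(string $html): string
{
    // strip_tags() removes the <style>/<script> tags but keeps the text
    // between them, so remove those elements, contents included, first.
    $html = preg_replace('~<(style|script)\b[^>]*>.*?</\1\s*>~is', '', $html);

    // Keep only the tags the feed accepts.
    $html = strip_tags($html, '<ul><ol><li><br><p><b>');

    // strip_tags() leaves attributes on the tags it keeps; reduce each
    // surviving opening tag to its bare name.
    $html = preg_replace('~<(ul|ol|li|br|p|b)\b[^>]*>~i', '<$1>', $html);

    // Collapse runs of whitespace left behind by the removed markup.
    return trim(preg_replace('/\s+/', ' ', $html));
}
```

With this in place, the attribute and tag removal loops in the earlier code would likely not be needed, although that has not been tested against real Cornerstone output.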

Hi @charlie,

A bit brutal, true, but something I have no hope of achieving myself! Lots of revisions through new prompts, hence probably a bit clunky. I had spotted the wp_enqueue_scripts line, but kept it in for the time being in case it interacted with other changes I had asked for.

You are right, all I really wanted for this exercise was HTML that is as clean and basic as possible - no classes, styles, or anything else - just headers, paragraphs, line breaks and anything else accepted by feeds.

That said, I now see potential for all sorts of other uses, having seen the output of your original code. Funny enough, my “objector” client is over from the Far East tomorrow, so your ideas are most perfectly timed to give them a running demonstration of “their fears” about builders. Cool stuff.

More in Secure Note,

Thanks,
Christopher

I think you need to elevate permissions on that site for me to edit the child theme. I would try changing the export to add the line about $GLOBALS here if you aren’t seeing dynamic data populate.

        // Find document and renderContentFromDocument
        $GLOBALS['post'] = get_post($doc->id());
        $resolver = cornerstone("Resolver");